U.S. patent application number 14/175,569 was published by the patent office on 2014-06-05 for a computing device, a system and a method for parallel processing of data streams.
This patent application is currently assigned to CORTICA LTD., which is also the listed applicant. The invention is credited to Karina Odinaev, Igal Raichelgauz, and Yehoshua Y. Zeevi.
Application Number: 14/175,569 (Publication No. 20140156901)
Document ID: /
Family ID: 50846725
Publication Date: 2014-06-05
United States Patent Application 20140156901
Kind Code: A1
Raichelgauz; Igal; et al.
June 5, 2014

COMPUTING DEVICE, A SYSTEM AND A METHOD FOR PARALLEL PROCESSING OF DATA STREAMS
Abstract
An apparatus for identification of input data against one or more
learned signals is provided. The apparatus comprises a number of
computational cores, each core having properties with at least some
statistical independence from the other computational cores, the
properties being set independently for each core. Each core is able
to independently produce an output indicating recognition of a
previously learned signal. The apparatus is further configured to
process the outputs produced by the number of computational cores
and to determine an identification of the input data based on the
produced outputs.
Inventors: Raichelgauz; Igal (Ramat Gan, IL); Odinaev; Karina (Ramat Gan, IL); Zeevi; Yehoshua Y. (Haifa, IL)
Applicant: CORTICA LTD. (Ramat Gan, IL)
Assignee: CORTICA LTD. (Ramat Gan, IL)
Family ID: 50846725
Appl. No.: 14/175,569
Filed: February 7, 2014
Related U.S. Patent Documents

Application Number | Filing Date  | Patent Number
12/084,150         | Apr 7, 2009  | 8,655,801
PCT/IL2006/001235  | Oct 26, 2006 |
14/175,569         |              |
Current U.S. Class: 710/305
Current CPC Class: G06N 3/049 20130101; G06N 3/063 20130101
Class at Publication: 710/305
International Class: G06F 13/40 20060101 G06F 13/40
Claims
1. An apparatus for processing a data stream, comprising: a
processing unit comprised of a plurality of computational cores,
each computational core is configured to receive an input data and
provide a unique output data, each computational core is randomly
programmed prior to receiving the input data to produce the unique
output data respective of the input data, wherein at least two of
the plurality of computational cores operate in parallel; an input
interface configured to receive the data stream and simultaneously
provide the received data stream to each of the inputs of the
plurality of computational cores; and an output interface configured
to simultaneously receive the output data from each of the
plurality of computational cores.
2. The apparatus of claim 1, further comprising: at least one
register configured to be updated responsive of the output
data.
3. The apparatus of claim 2, wherein the at least one register
contains at least one of: a mode of operation of the plurality of
the computation cores, the input data, the output data, and an
outcome indication respective of the output data.
4. The apparatus of claim 3, wherein the outcome indication is at
least one of: winner-takes-all, majority voting, and statistical
analysis.
5. The apparatus of claim 3, wherein the outcome indication is
provided respective of a determination whether the plurality of
computational cores identified the input data.
6. The apparatus of claim 1, wherein the input data comprises
temporal data.
7. The apparatus of claim 6, wherein the temporal data comprises at
least a segment of binary streaming data.
8. The apparatus of claim 1, wherein the apparatus is configured to
be asynchronously adaptive with respect of the input data.
9. The apparatus of claim 1, wherein the apparatus is configured to
perform respective of the input data a task including at least one
of: filtering unknown data streams, image recognition, speech
recognition, clustering, indexing, routing, video signals analysis,
video indexing, categorization, string matching, recognition tasks,
verification tasks, tagging, and outlier detection.
10. The apparatus of claim 1, wherein the input data comprises at
least one of: signals, streams of signals, string, regular
expression, sensor output signals, database records, processor
outputs, naturally structured signals, speech signals, image
signals, physiological signals, medical signals, and text
signals.
11. The apparatus of claim 1, wherein randomly programming each of
the plurality of computational cores comprises: preprogramming each
computational core with a respective function.
12. The apparatus of claim 11, wherein a statistical distribution
is used to generate the respective function for each of the
plurality of computational cores.
13. The apparatus of claim 12, wherein the statistical distribution
is a Gaussian distribution.
14. The apparatus of claim 1, wherein the plurality of
computational cores is divided into a plurality of subgroups of
computational cores, wherein each of the subgroups of computational
cores is configured for mapping variants of the input data.
15. The apparatus of claim 1, wherein the processing unit is
configured to capture at least one intrinsic dimension of the input
data.
16. The apparatus of claim 1, wherein at least one of the
computation cores of the processing unit is a leaky
integrate-to-threshold unit.
17. The apparatus of claim 16, wherein at least one of the
computation cores of the processing unit is a coupling node
unit.
18. The apparatus of claim 17, wherein the coupling node unit is
configured to connect between at least a first leaky
integrate-to-threshold unit and a second leaky
integrate-to-threshold unit.
19. An integrated circuit comprising the apparatus of claim 1.
20. A method for processing a data stream, comprising: randomly
programming a plurality of computational cores of a processing
unit; configuring at least two of the plurality of computational
cores to operate in parallel; receiving the data stream;
simultaneously providing the received data stream to each of the
inputs of the at least two computational cores; and configuring
the at least two computational cores to receive the data stream and
to provide a unique output data respective of the input data stream.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of U.S. patent
application Ser. No. 12/084,150, having a filing date of Apr. 7,
2009, now U.S. Pat. No. 8,655,801, which is a National Phase of PCT
Patent Application No. PCT/IL2006/001235 having International
filing date of Oct. 26, 2006, which claims the benefit of Israel
Patent Application No. 173409 filed on Jan. 29, 2006 and Israel
Patent Application No. 171577 filed on Oct. 26, 2005. The contents
of the above-referenced patent applications are all incorporated
herein by reference.
FIELD AND BACKGROUND
[0002] The present invention relates to real-time parallel
processing using so-called liquid architectures, and, more
particularly but not exclusively, to real-time processing and
classification of streaming noisy data using adaptive,
asynchronous, fault tolerant, robust, and parallel processors.
[0003] During the last decade, there has been a growing demand for
solutions to the computing problems of Turing-machine (TM)-based
computers, which are commonly used for interactive computing. One
suggested solution is a partial transition from interactive
computing to proactive computing. Proactive computers are needed,
inter alia, for providing fast computing of natural signals from
the real world, such as sound and image signals. Such fast
computing requires the real time processing of massive quantities
of asynchronous sources of information. The ability to analyze such
signals in real time may allow the implementation of various
applications, which are designed for tasks that currently can be
done only by humans. In proactive computers, billions of computing
devices may be directly connected to the physical world so that I/O
devices are no longer needed.
[0004] As proactive computers are designed to allow the execution
of day-to-day tasks in the physical world, an instrument that
constitutes the connection to the real world must be part of the
process, so that the computer systems will be exposed to, and
linked with, the natural environment. In order to allow such
linkages, the proactive computers have to be able to convert real
world signals into digital signals. Such conversions are needed for
performing various tasks which are based on analysis of real world
natural signals, for example, human speech recognition, image
processing, textual and image content recognition, such as optical
character recognition (OCR) and automatic target recognition (ATR),
and objective quality assessment of such natural signals.
[0005] Regular computing processes are usually based on TM
computers which are configured to compute deterministic input
signals. As commonly known, occurrences in the real world are
unpredictable and usually do not exhibit deterministic behavior.
Execution of tasks which are based on analysis of real world
signals have high computational complexity and, thus, analysis of
massive quantities of noisy data and complex structures and
relationships is needed. As the commonly used TM-based computers
are not designed to handle such unpredictable input signals in an
effective manner, the computing process usually requires high
computational power and a powerful energy source.
[0006] Gordon Moore's Law predicts exponential growth of the number
of transistors per integrated circuit. Such exponential growth is
needed in order to increase the computational power of single-chip
processors; however, as the transistors become smaller, the
effective length in the near-surface region of the silicon substrate
between the edges of the drain and source regions of the field
effect transistor is reduced, and it becomes practically impossible
to synchronize the entire chip. The reduced length can be
problematic, as such a large number of transistors may be leaky,
noisy, and unreliable. Moreover, fabrication cost grows each year,
as it becomes increasingly difficult to synchronize an entire chip
at multi-GHz clock rates and to perform design verification and
validation of a design having more than 100 million
transistors.
[0007] In the light of the above, it seems that TM-based computers
have a growth limit and, therefore, may not be the preferred
solution for analyzing real world natural signals. An example of a
pressing problem that requires analysis of real world signals is
speech recognition. Many problems have to be solved in order to
provide an efficient generic mechanism for speech recognition.
However, most of the problems are caused by the unpredictable
nature of the speech signals. For example, one problem is due to
the fact that different users have different voices and accents,
and, therefore, speech signals that represent the same words or
sentences have numerous different and unpredictable structures. In
addition, environmental conditions, such as noise and channel
limitations, may also affect the performance of the speech
recognition.
[0008] Another example of a pressing problem which is not easily
solved by TM-based computers is related to the field of string
matching and regular expression identification. Fast string
matching and regular expression detection are necessary for a wide
range of applications, such as information retrieval, content
inspection, data processing, and others. Most of the algorithms
available for string matching and regular expression identification
are endowed with high computational complexity and, therefore,
require many computational resources. A known hardware solution to
the problem is based on the Finite-State-Machine (FSM) model and
requires a large amount of memory for storing all the optional
strings, with the memory being sequentially accessed for each
execution of a matching operation. Such a solution requires, in
turn, large memory arrays that constitute a bottleneck that limits
throughput, since access to memory is a time- or clock-cycle-
consuming operation. Therefore, it is clear that a solution that
performs string matching while saving on accesses to memory can
substantially improve the performance of the process.
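For comparison, the FSM-based approach criticized above can be sketched as follows. This is an illustrative reconstruction, not code from the patent: a DFA transition table (built KMP-style) is consulted once per input character, which is exactly the per-character memory access that limits the throughput of table-driven matchers. The `build_dfa` and `match_positions` names are hypothetical.

```python
def build_dfa(pattern, alphabet):
    """Build a DFA transition table for exact matching of `pattern`."""
    m = len(pattern)
    dfa = [{c: 0 for c in alphabet} for _ in range(m + 1)]
    dfa[0][pattern[0]] = 1
    restart = 0  # state the automaton falls back to on a mismatch
    for state in range(1, m):
        for c in alphabet:
            dfa[state][c] = dfa[restart][c]   # copy fallback transitions
        dfa[state][pattern[state]] = state + 1
        restart = dfa[restart][pattern[state]]
    for c in alphabet:                        # accepting state: keep scanning
        dfa[m][c] = dfa[restart][c]
    return dfa

def match_positions(text, pattern):
    """Return all start offsets of `pattern` in `text`."""
    alphabet = set(text) | set(pattern)
    dfa = build_dfa(pattern, alphabet)
    state, hits = 0, []
    for i, c in enumerate(text):
        state = dfa[state][c]                 # one table lookup per character
        if state == len(pattern):
            hits.append(i - len(pattern) + 1)
    return hits

print(match_positions("abababca", "abab"))   # [0, 2]
```

Every character of the input costs one table access; with large pattern sets the table no longer fits in fast memory, which is the bottleneck the paragraph describes.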
[0009] During the last decade, a number of non-TM computational
solutions have been adopted to solve the problems of real world
signals analysis. A known computational architecture which has been
tested is the neural network. A neural network is an interconnected
assembly of simple nonlinear processing elements, units or nodes,
whose functionality is loosely based on the animal brain. The
processing ability of the network is stored in the inter-unit
connection strengths, or weights, obtained by a process of
adaptation to, or learning from, a set of training patterns. Neural
nets are used in bioinformatics to map data and make predictions.
However, a pure hardware implementation of a neural network
utilizing existing technology is not simple. One of the
difficulties in creating true physical neural networks lies in the
highly complex manner in which a physical neural network must be
designed and constructed.
[0010] One solution, which has been proposed for solving the
difficulties in creating true physical neural networks, is known as
a liquid state machine (LSM). An example of an LSM is disclosed in
"Computational Models for Generic Cortical Microcircuits" by
Wolfgang Maass et al., of the Institute for Theoretical Computer
Science, Technische Universitaet Giaz, Graz, Austria, published on
Jan. 10, 2003. The LSM model of Maass et al. comprises three parts:
an input layer, a large randomly connected core which has the
intermediate states transformed from input, and an output layer,
liven a time series as input, the machine can produce a time series
as a reaction to the input. To get the desired reaction, the
weights on the links between the core and the output must be
adjusted.
[0011] U.S. Patent Application No. 2004/0153426, published on Aug.
5, 2004, discloses the implementation of a physical neural network
using a liquid state machine in nanotechnology. The physical neural
network is based on molecular connections located within a
dielectric solvent between presynaptic and postsynaptic electrodes
thereof, such that the molecular connections are strengthened or
weakened according to an application of an electric field or a
frequency thereof to provide physical neural network connections
thereof. A supervised learning mechanism is associated with the
liquid state machine, whereby connection strengths of the molecular
connections are determined by presynaptic and postsynaptic activity
respectively associated with the presynaptic and postsynaptic
electrodes, wherein the liquid state machine comprises a dynamic
fading memory mechanism.
[0012] Another type of network, very similar to the LSM, is known
as an echo state net (ESN) or an echo state machine (ESM), which
allows universal real-time computation without stable states or
attractors on continuous input streams. From an engineering point
of view, the ESN model seems nearly identical to the LSM model.
Both use the dynamics of recurrent neural networks for
preprocessing input and train extra mechanisms for obtaining
information from the dynamic states of these networks. An ESN based
neural network consists of a large fixed recurrent reservoir
network from which a desired output is obtained by training
suitable output connection weights. Although these systems and
methods present optional solutions to the aforementioned
computational problem, the solutions are complex and in any event
do not teach how the liquid state machine can be efficiently used
to solve some of the signal processing problems.
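The division of labor that the paragraph attributes to the ESN/LSM family, a fixed random recurrent reservoir from which an output is obtained by training only the linear readout weights, can be sketched as below. This is a generic illustration under assumed hyperparameters (reservoir size, spectral radius 0.9, a delay-2 memory task), not a reconstruction of any cited system.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100                                   # reservoir size

# Fixed random reservoir, rescaled so its spectral radius is below 1
# (a common sufficient condition for the fading "echo state" memory).
W = rng.normal(size=(N, N))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))
W_in = rng.normal(size=N)

def run_reservoir(u):
    """Drive the untrained reservoir with input sequence u; return states."""
    x = np.zeros(N)
    states = []
    for u_t in u:
        x = np.tanh(W @ x + W_in * u_t)   # fixed recurrent dynamics
        states.append(x.copy())
    return np.array(states)

# Task: recover u[t-2] from the current reservoir state (a memory task).
u = rng.uniform(-1, 1, size=500)
X, y = run_reservoir(u)[2:], u[:-2]

# Only the output connection weights are trained, by least squares.
W_out, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.mean((X @ W_out - y) ** 2) < 1e-2)   # low training error
```

Nothing inside the reservoir is ever adjusted; all task-specific learning is confined to `W_out`, mirroring the text's description of the ESN.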
[0013] There is thus a widely recognized need for, and it would be
highly advantageous to have, a method and a system for processing
stochastic noisy natural signals in parallel computing devoid of
the above limitations.
SUMMARY
[0014] Certain embodiments disclosed herein include an apparatus
for processing a data stream. The apparatus comprises a processing
unit comprised of a plurality of computational cores, each
computational core is configured to receive an input data and
provide a unique output data, each computational core is randomly
programmed prior to receiving the input data to produce the unique
output data respective of the input data, wherein at least two of
the plurality of computational cores operate in parallel; an input
interface configured to receive the data stream and simultaneously
provide the received data stream to each of the inputs of the
plurality of computational cores; and an output interface configured
to simultaneously receive the output data from each of the
plurality of computational cores.
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] The invention is herein described, by way of example only,
with reference to the accompanying drawings. With specific
reference now to the drawings in detail, it is stressed that the
particulars shown are by way of example and for purposes of
illustrative discussion of the preferred embodiments of the present
invention only, and are presented in order to provide what is
believed to be the most useful and readily understood description
of the principles and conceptual aspects of the invention. In this
regard, no attempt is made to show structural details of the
invention in more detail than is necessary for a fundamental
understanding of the invention, the description taken with the
drawings making apparent to those skilled in the art how the
several forms of the invention may be embodied in practice.
[0016] In the drawings:
[0017] FIG. 1 is a schematic illustration of a computational layer,
according to a preferred embodiment of the present invention;
[0018] FIG. 2 is a schematic illustration of an integrated circuit
that functions as a computational core in a computational layer,
according to a preferred embodiment of the present invention;
[0019] FIG. 3A is a schematic illustration of an integrated circuit
that functions as a leaky integrate-to-threshold unit, according to
a preferred embodiment of the present invention;
[0020] FIG. 3B is a graph depicting the charging current and the
threshold of the leaky integrate-to-threshold unit, according to a
preferred embodiment of the present invention;
[0021] FIG. 3C is another schematic illustration of an integrated
circuit that functions as a leaky integrate-to-threshold unit,
according to an embodiment of the present invention;
[0022] FIGS. 4A and 4B are schematic illustrations of an integrated
circuit that functions as a leaky integrate-to-threshold unit and
is implemented using very large scale integration (VLSI)
technology, according to a preferred embodiment of the present
invention;
[0023] FIG. 5 is a schematic illustration of a coupling node unit
(CNU), according to a preferred embodiment of the present
invention;
[0024] FIG. 6 is a set of two graphs which depict the dynamics of
the CNU, according to embodiments of the present invention;
[0025] FIGS. 7A, 7B, 7C and 7D are schematic illustrations of CNUs
that may be implemented using VLSI technology, according to
embodiments of the present invention;
[0026] FIG. 8 is a schematic illustration of the liquid section of
a computational core, according to a preferred embodiment of the
present invention;
[0027] FIG. 9A is a schematic illustration of the linker section of
a computational core, according to a preferred embodiment of the
present invention;
[0028] FIG. 9B is a schematic three dimensional illustration of a
computational core, according to a preferred embodiment of the
present invention;
[0029] FIG. 9C is a graphical representation of a digital
implementation of a liquid section, according to one preferred
embodiment of the present invention;
[0030] FIG. 10 is a schematic illustration of an electric circuit
that represents the computational core of FIG. 2 and an output
circuit, according to a preferred embodiment of the present
invention;
[0031] FIG. 11A is a block diagram that depicts the relationship
among electronic components which are related to the computational
layer, according to a preferred embodiment of the present
invention;
[0032] FIGS. 11B and 11C are exemplary computational layers, as in
FIG. 11A, that receive two different external data streams,
according to one preferred embodiment of the present invention.
[0033] FIG. 12 is a schematic illustration that depicts the
connections between an exemplary computational core and the
computational layer, according to a preferred embodiment of the
present invention;
[0034] FIG. 13 is a schematic illustration of a proactive computer
which is based on a number of sequential computational layers,
according to a preferred embodiment of the present invention;
[0035] FIGS. 14A and 14B are schematic representations of a
computational layer, as shown in FIG. 11A, which is connected to
three encoders and to a single encoder, respectively, according to
embodiments of the present invention;
[0036] FIGS. 15A and 15B are schematic representations of the
implementation of hard-coded division and dynamic division,
respectively, of an external data stream, according to embodiments
of the present invention;
[0037] FIG. 16A is a schematic representation of two connected
computational layers, according to a preferred embodiment of the
present invention;
[0038] FIG. 16B is a graphical illustration of the communication
between two computational layers during a certain period, according
to a preferred embodiment of the present invention;
[0039] FIGS. 17A and 17B are graphical representations of
sequential computational layers and the connections between them,
according to a preferred embodiment of the present invention;
[0040] FIG. 18 is a schematic representation of the computational
core of FIG. 2 and a connection thereof to a resource allocation
control unit, according to a preferred embodiment of the present
invention;
[0041] FIG. 19 is a schematic representation of a computational
layer that is connected to a single encoder, as shown in FIG. 14B,
according to an embodiment of the present invention;
[0042] FIG. 20 is a graphical representation of a three dimensional
space representing the outputs of a computational core, according
to an embodiment of the present invention;
[0043] FIG. 21 is a table of reporting units in a computational
layer with twelve computational cores, according to an embodiment
of the present invention;
[0044] FIG. 22 is a graphical representation of a two dimensional
space representing the outputs of a computational core, according
to an embodiment of the present invention;
[0045] FIG. 23 is a set of graphs of an example which depict the
outputs of different computational cores in a two dimensional
space, according to an embodiment of the present invention;
[0046] FIG. 24 is a graphical representation of two different
subspaces and a conjugated subspace used to identify a certain
signal during the operational mode, according to an embodiment of
the present invention;
[0047] FIG. 25A is a table representing the outputs of a
computational layer with twelve cores for different patterns from
the same class, according to an embodiment of the present
invention;
[0048] FIG. 25B is a schematic representation of a computational
layer, according to a preferred embodiment of the present
invention;
[0049] FIG. 25C is an exemplary memory array, according to a
preferred embodiment of the present invention;
[0050] FIG. 26 is a schematic representation of the computational
core of FIG. 2, further comprising an encoder, according to a
preferred embodiment of the present invention;
[0051] FIG. 27 is a schematic representation of a computational
layer, according to another embodiment of the present
invention;
[0052] FIG. 28 is a schematic representation of the separation of
the received data stream into parts based on a predefined table,
according to a preferred embodiment of the present invention;
[0053] FIG. 29A is a schematic representation of a computational
core having a direct connection between computational processors of
the liquid section and memory components of the linker section,
according to a preferred embodiment of the present invention;
[0054] FIG. 29B is a computational core, as depicted in FIG. 9B, in
a learning mode, according to a preferred embodiment of the present
invention;
[0055] FIG. 29C is a computational core, as depicted in FIG. 9B, in
an operational mode, according to a preferred embodiment of the
present invention;
[0056] FIG. 30 is a graph for describing the response probability
of different LTUs to a certain string, according to a preferred
embodiment of the present invention;
[0057] FIG. 31 is a graphical representation of a computational
layer, according to another embodiment of the present invention;
and
[0058] FIG. 32 is a simplified flowchart diagram of a method for
processing a data stream using a number of computational cores,
according to a preferred embodiment of the present invention;
[0059] FIG. 33 is a graphical representation of a diagram of a
computational layer, as depicted in FIG. 11A, which further
comprises a number of voting components, input preprocessing
components, and a signature selector, according to one embodiment
of the present invention; and
[0060] FIG. 34 is a graphical representation of a diagram of a
computational layer, as depicted in FIG. 11A, in which the
computational cores are divided into several subgroups, each of
which receives inputs from a different source, according to one
embodiment of the
present invention.
DETAILED DESCRIPTION
[0061] The present embodiments comprise an apparatus, a system and
a method for parallel computing by simultaneously using a number of
computational cores. The apparatus, system and method may be used
to construct an efficient proactive computing device with
configurable computational cores. Each core comprises a liquid
section, and is preprogrammed independently of the other cores with
a function. The function is typically random, and the core retains
the preprogrammed function although other aspects of the core can
be reprogrammed dynamically. Preferably, a Gaussian or like
statistical distribution is used to generate the functions, so that
each core has a function that is independent of the other
cores.
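A minimal sketch of the preprogrammed-random-function idea described above, under loudly illustrative assumptions: each core is reduced to a single Gaussian-drawn linear threshold that is set once and then held fixed, whereas the patent's cores are far richer. All names here are hypothetical.

```python
import numpy as np

def make_core(rng, n_inputs=16):
    """Draw a fixed, Gaussian-distributed random function for one core."""
    w = rng.normal(size=n_inputs)        # set once, independently per core
    theta = rng.normal()                 # per-core random threshold
    def core(x):
        # The core only reports whether its private random feature fires.
        return float(w @ x) > theta
    return core

rng = np.random.default_rng(42)
cores = [make_core(rng) for _ in range(1000)]   # statistically independent

x = rng.normal(size=16)                  # one shared input
responses = [core(x) for core in cores]
# The same input produces a diverse response pattern across the cores:
print(0 < sum(responses) < len(responses))
```

Because every core's parameters are independent draws from the same Gaussian, no two cores implement the same function, which is the statistical independence the paragraph requires.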
[0062] The apparatus, system and method of the present invention
are thus endowed with computational and structural advantages
characteristic of biological systems. The embodiments of the
present invention provide an adaptively-reconfigurable parallel
processor having a very large number of computational units. The
processor can be dynamically restructured using relatively simple
programming.
[0063] The principles and operation of an apparatus, system and
method according to the disclosed embodiments may be better
understood with reference to the drawings and accompanying
description.
[0064] Before explaining at least one embodiment of the invention
in detail, it is to be understood that the invention is not limited
in its application to the details of construction and the
arrangement of the components set forth in the following
description or illustrated in the drawings. The invention is
capable of other embodiments or of being practiced or carried out
in various ways. Also, it is to be understood that the phraseology
and terminology employed herein is for the purpose of description
and should not be regarded as limiting.
[0065] According to one aspect of the present invention there is
provided an apparatus, a system and a method for asynchronous,
adaptive and parallel processing of data streams using a computing
device with a number of computational cores. The disclosed
apparatus, system and method can be advantageously used in
high-speed, fault-tolerant, asynchronous signal processing.
Preferably, the computing device may be used in a new computational
model for proactive computing of natural, ambiguous, and noisy data,
or of data which is captured under a severe signal-to-noise ratio
(SNR).
[0066] As further described below, all or some of the computational
cores of the computing device receive the same data stream which
they simultaneously process. The computational units execute
sub-task components of a computational task in parallel. It should
be noted that the computing device can also execute multiple tasks
in parallel.
[0067] Coordination among computational cores may be based on the
principle of winner-takes-all, voting using majority voting,
statistical analysis, etc. The computing device may produce a
unique output according to a required task.
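The coordination rules named above can be sketched generically. The data shapes below (per-core labels, per-core confidence scores) are assumptions for illustration; the patent does not prescribe this interface.

```python
from collections import Counter

def majority_vote(outputs):
    """Accept the label reported by more than half of the cores, else None."""
    label, count = Counter(outputs).most_common(1)[0]
    return label if count > len(outputs) / 2 else None

def winner_takes_all(scores):
    """Each core reports (label, confidence); the strongest core decides."""
    return max(scores, key=lambda s: s[1])[0]

print(majority_vote(["cat", "cat", "dog", "cat"]))     # cat
print(winner_takes_all([("cat", 0.4), ("dog", 0.9)]))  # dog
```

Either rule collapses the many parallel core outputs into the single unique output the device reports for the task.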
[0068] One of the main advantages of the present invention is in
its computational power. The computational power of the
computational layers or the system as a whole lies in its
multi-parallelism and huge space of possible solutions to a given
task. This is radically different from the principles of design and
operation of conventional TM based processors. Both the computing
device as a whole and configurations of the computational cores may
be adaptively reconfigured during the operation. It should be noted
that the computing device may be implemented using very large scale
integration (VLSI) technology. The system of the present invention
is fault tolerant and such an implementation endows the VLSI with
new degrees of freedom that can increase the VLSI production yields
because of the improved fault tolerance.
[0069] The system, the apparatus, and the method of the present
invention may be used for performing tasks that currently consume
high computational power, such as fast string matching, image signal
identification, speech recognition, analysis of medical, video, and
physiological signals, data categorization, data classification, text
recognition, and regular expression identification. Using the
present embodiments, these tasks can be efficiently accomplished,
as a large number of functional computational units or cores are
used in parallel to execute every step in the computational
process. The data stream is transmitted to the relevant
computational cores simultaneously.
[0070] In one embodiment of the present invention, the
computational core itself is constructed from two sections, a
liquid section and a linker section, as will be explained in
greater detail hereinbelow.
[0071] In use, each computational core is associated with a
specific subset of signals from the external world, and produces a
unique output such as a clique of elements or a binary vector based
thereupon. Such a unique output may be mapped by the linker section
to the actual output of the computational core. Preferably, the
linker section is programmed to map a certain subset of cliques to
the core's actual output, according to the required task.
[0072] One of the factors that support the efficiency of the
computing device is that the output depends only on the state of
the liquid part of the core that the input brings about. There is
no use of memory and therefore no access is made to storage
devices. Thus, the throughput of the computing device is affected
only by the propagation time of the signal in the computational
cores. The computational cores themselves are preferably
implemented using fast integrated circuits, as further described
below, and the operational delay depends only on the signal propagation
time through the core. Thus, the computing device provides an
efficient solution for many computing problems that usually require
frequent access to the memory, such as fast string matching and
regular expression identification.
[0073] Reference is now made to FIG. 1, which is a schematic
illustration of a computing device comprising a computational layer
1, which processes an external data stream 5 according to a
preferred embodiment of the present invention. An external data
stream may be understood as signals or streams of signals or data
from the external world, such as image or sound or video streams;
analog or digital signals, such as signals that represent a
predefined string or a regular expression; sensor output signals;
database records; CPU outputs; and naturally structured signals, such as
locally-correlated dynamical signals of speech and images, etc.
[0074] As depicted in FIG. 1, the computational layer 1 comprises
an input interface 61, which is designed for receiving the external
data stream 5. The input interface 61 is directly connected to a
number of different computational cores 100. As the connection is
direct, the input interface has the ability to simultaneously
transfer the external data stream 5 to each one of the
computational cores 100. Each one of the computational cores 100 is
randomly programmed, preferably using a statistical function, and
thus each computational core produces a unique output for a given
input. Preferably, the computational core comprises a liquid with a
unique function and configuration which is designed to produce a
unique output for an input of interest. This unique output is
referred to as a state or liquid state.
[0075] It should be noted that since each one of the computational
cores 100 is randomly programmed over a statistical distribution,
better coverage of the distribution is achieved when more
computational cores 100 are used, as a greater diversity of
processing patterns is obtained. Therefore, a large number of
computational cores 100 ensures that the external data stream is
processed according to a large number of diverse patterns. As
described below, such diversity increases the probability that a
certain external data stream will be identified by the
computational layer 1. All the outputs are transferred to an output
interface 64, which is directly connected to each one of the
computational cores 100. The output interface 64 is configured to
receive the outputs and, preferably, to forward them to a central
computing unit (not shown).
[0076] Such an embodiment can be extremely useful for
classification tasks which are performed in many common processes,
such as clustering, indexing, routing, string matching, recognition
tasks, verification tasks, tagging, outlier detection, etc. Each
one of the numerous computational cores is designed to receive and
classify the external data stream at the same time as the other
computational cores of the computational layer 1. As further
described below, the classification is based on a predefined set of
possible signals which have been introduced to the computational
core 100 beforehand.
[0077] In order to describe the computational layer 1 more fully,
with additional reference to FIG. 2, FIGS. 3A and 3B, FIG. 8 and
others, the structure and function of the computational cores 100
will be further described. The computational cores 100 each have a
unique processing pattern, which is defined in a section which may be
referred to as the liquid section 46.
[0078] Reference is now made to FIG. 2, which is a schematic
illustration of a computational core 100 for processing one or more
data streams, in accordance with one embodiment of the invention.
FIG. 2 depicts an integrated circuit that is divided into a liquid
section 46 and a linker section 47, which is designed to produce
the overall core output as a vector or a binary value, as described
below.
[0079] As depicted in FIG. 2, the computational core 100 further
comprises a set of flags 50, which are used to indicate, inter
alia, the current operation mode of the computational core 100 and
the outcome of the processing of the received data stream, as
further described below. The computational core 100 further
comprises a set of input pins 49 for receiving input signals, such
as a digital stream, and a set of output pins 48 for, inter alia,
forwarding the received input signals.
[0080] The liquid section 46 comprises an analog circuit that
receives temporal segments of binary streaming data {right arrow
over (S)}(|t<t.sub.s|), made up of two constant voltage levels
V.sub.high and V.sub.low that respectively represent the binary
values 1 and 0. It should be noted that the input may not be
binary, for example in the digital implementation.
[0081] The liquid section 46 is designed to capture and preferably
forward a unique pattern of the received external data stream.
Preferably, the external data stream is encoded in the temporal
segments of streaming binary data. The external data stream may be
understood as a stream of digital signals, a stream of analog
signals, a stream of voice signals, a stream of image signals, a
stream of real-world signals, etc. The external data stream is
preferably encoded as a binary vector having a finite length that
comprises several discrete values.
[0082] The task of the liquid section 46 is to capture one
intrinsic dimension (property) of the external environment.
Properties are encoded in temporal segments of input, and drive the
liquid section 46 to a unique state.
[0083] The captured properties are represented in the liquid
section 46 by liquid states (LS). An LS is a vector with a finite
length comprising several discrete values. Such an embodiment allows
identifications to be made from noisy data as will be explained
below. Each liquid-state captures a unique property of the
presented scenario or event. The representation may be context
dependent and thus affords context aware operation at the lower
levels of the processing scheme. These abilities enable the
computational layer to provide efficient interfacing with the
physical world.
[0084] The liquid section 46 in effect comprises a finite memory,
in terms of temporal length of the input. For efficient computing
in such an embodiment, temporal segments {right arrow over (S)},
which are received by the liquid section 46, are set to this finite
length |t<t.sub.s|=T, preferably by means of the input encoder to
be discussed below.
[0085] The received external data stream drives the liquid section
46 to a unique state which is associated with a record or a
register that indicates that the received external data stream has
been identified.
[0086] In one embodiment, the liquid section 46 of the
computational core is comprised of basic units of two types. One
unit is preferably a leaky integrate-to-threshold unit (LTU) and
the other type is preferably a coupling node unit (CNU) which is
used for connecting two LTUs. The CNUs are distributed over the
liquid section 46 in a manner that defines a certain unique
processing pattern. The CNU connections can be changed dynamically,
as will be described in greater detail below.
[0087] Reference is now made to FIG. 3A, which is an exemplary LTU
500 that is implemented using an electric circuit. The LTU 500
preferably comprises an input 52 connected, as shown at 56, to a
resistance 51, a capacitance 55, a measuring module 53, and an output 54. The
exemplary LTU electric circuit 500 is constructed according to the
following mathematical model:
RC(dV/dt)=-(V-V.sub.ref)+R(I.sub.CN(t)) (1)
where R denotes the input resistance, as shown at 51, C denotes the
capacitance, as shown at 55, V.sub.ref denotes the reference
potential of the electric circuit 500, V denotes the voltage at the
measuring point of the electric circuit 500, and I.sub.CN denotes
the input current which is received from the CN (coupling
node).
[0088] If V exceeds a certain threshold voltage 57, it is reset to
V.sub.ref and held there during the dead time period T.sub.d.
The RC circuit is used to model the charging of the LTU from its
resting potential to the threshold. Then, the current is measured by
a measuring module 53 which is designed to generate a current flow
output only if supra-threshold spikes of the measured charging
current are produced in the output 54, as shown in FIG. 3B. FIG. 3C
is an additional schematic illustration of the LTU 500. It should
be noted that LTUs might also be implemented in a VLSI, as depicted
in FIGS. 4A and 4B.
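As a non-authoritative illustration, the LTU model of equation (1) can be sketched in software. The forward-Euler integration step, the function name, and all parameter values below are assumptions made for this example; they are not taken from the described circuit.

```python
def simulate_ltu(i_cn, dt=1e-4, R=1e6, C=1e-8, v_ref=0.0,
                 v_thresh=1.0, t_dead=5e-4):
    """Sketch of equation (1): RC dV/dt = -(V - V_ref) + R*I_CN(t).

    i_cn is a sequence of input-current samples; returns a list of 0/1
    spike indicators, one per time step.
    """
    tau = R * C                  # time constant of the RC circuit
    v = v_ref                    # start at the reference potential
    dead_until = -1.0            # end of the current dead-time period T_d
    t = 0.0
    spikes = []
    for current in i_cn:
        if t >= dead_until:
            # forward-Euler step of the differential equation
            v += (dt / tau) * (-(v - v_ref) + R * current)
        if v >= v_thresh and t >= dead_until:
            spikes.append(1)
            v = v_ref            # reset to the reference potential
            dead_until = t + t_dead
        else:
            spikes.append(0)
        t += dt
    return spikes
```

In this sketch a constant supra-threshold current produces a regular spike train, while a sub-threshold current produces none.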
[0089] Reference is now made to FIG. 5, which is a schematic
diagram of an exemplary model of a CNU 600, according to a
preferred embodiment of the present invention. As described above,
the CNU 600 is a dynamic connector between two LTUs. The CNU 600 is
designed to act as a weighted connection, preferably with a
variable dynamic weight, marked with the symbol .SIGMA., which is
influenced by the input frequency history. FIG. 6 depicts
the dynamics of the connection: the connection weight
may be increased, as shown at 55, or decreased, as shown at 54,
depending on the input. A mathematical model of the CNU's variable
weight is:
I.sub.CN(t)=.SIGMA.CNC.sub.i(t) (2)
CNC=Ae.sup.-t/.tau..sup.CN (3)
where I.sub.CN(t) denotes the coupling node current, as shown at
67, CNC denotes an input coupling node current, as shown at 68, A
denotes a positive or a negative dynamic coefficient of the CNU
600, and .tau..sub.CN denotes the decay time constant of the CNU
600.
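For illustration only, equations (2) and (3) can be sketched as a sum of exponentially decaying contributions, one per presynaptic spike. The function name, the spike-time representation, and the parameter values are assumptions made for this example.

```python
import math

def cnu_current(t, spike_times, A=1.0, tau_cn=0.01):
    """Sketch of equations (2) and (3): each spike at time ts contributes
    CNC = A*exp(-(t - ts)/tau_CN), and I_CN(t) sums the contributions."""
    return sum(A * math.exp(-(t - ts) / tau_cn)
               for ts in spike_times if ts <= t)
```

A positive coefficient A would model a CNU with positive dynamics, and a negative A one with negative dynamics, in the spirit of FIGS. 7C and 7D.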
[0090] It should be noted that the CNU 600 might also be
implemented in VLSI architecture, as shown in FIGS. 7A, 7B, 7C, and
7D, which are diagrams showing three possible CNUs. One VLSI
implementation, as shown at 75 of FIG. 7A and in FIG. 7B, is a
static CNU where the CNC is constant. Another VLSI implementation
is a CNU with negative dynamics, which is shown at 76 of FIG. 7A
and in FIG. 7C. FIG. 7D depicts an implementation of a CNU with
positive dynamics, as shown at 77 of FIG. 7A. Each one of the CNUs
may be weighted and decay in time in a different manner. As
described above, liquid sections of different computational cores
100 may be randomly programmed, preferably according to a
statistical function, in order to create separate computational
cores with a diversity of patterns. In one embodiment, the weighting
and decay time of the CNUs are initially set using a statistical
distribution function. Preferably, the weighting and decay time of
the CNUs of all the liquid sections of the computational cores is
randomly set. In such a manner, it is ensured that a diversity of
patterns is given to the computational cores.
[0091] It should be noted that the given description of the CNU and
the LTU is only one possible implementation of these components.
The CNU and the LTU may be implemented using any software or
hardware modules or components and different features may be
provided as programmable parameters. Moreover, a simpler
implementation of the CNU, such as a CNU with a constant CNC, and a
simpler implementation of the LTU, such as an LTU without T.sub.d,
may also be used.
[0092] Reference is now made to FIG. 8, which is a graphical
representation of an exemplary liquid section 46, according to one
embodiment of the present invention. The liquid section 46
comprises a grid of LTUs, as shown at 702, which are connected by
one or more CNUs, as shown at 701. The CNUs are randomly applied,
as described above. In the exemplary liquid section 46 that is
depicted in FIG. 8, approximately 1000 CNUs are applied to randomly
connect a grid of .about.100 LTUs. It should be noted that the
liquid section may be implemented using any software or hardware
module.
[0093] In one embodiment of the present invention, the CNUs are
applied according to a variable probability function that is used
to estimate the probability that a CNU connects a pair of LTUs.
Preferably, the probability of a CNU being present between two LTUs
depends on the distance between the two LTUs, as denoted by the
following equation:
Cexp(-D(i,j)/.lamda..sup.2) (4)
where .lamda. and C denote variable parameters, preferably having
the same or different average value in all the computational cores,
and D denotes a certain Euclidean distance between LTU i and LTU j.
In order to ensure a large degree of freedom and heterogeneity
between different computational cores that comprise the
computational layer, each liquid section 46 has random,
heterogeneous .lamda. and C parameters that determine the average
number of CNUs according to the .lamda. and C distribution. It
should be noted that other algorithms may be used as random number
generators in order to determine the distribution of CNUs between
the LTUs. When a certain external data stream is received by the
liquid section 46, it is forwarded via the CNUs to the different
LTUs. The received external data stream may or may not trigger the
liquid section 46, causing it to enter a state and generate an
output to the linker section 47. The generation of the output
depends on the distribution of the CNUs over the liquid section 46.
Preferably, a certain binary vector or any other unique signature
is generated as a reaction to the reception of an external data
stream. This embodiment ensures that the liquid section 46
generates different outputs in response to the reception of
different signals. For each signal, a different output, which is
referred to as a state, may be entered.
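The distance-dependent wiring rule of equation (4) can be illustrated with the following sketch, which draws CNU connections over a grid of LTU positions. The grid size, the fixed seed, and the C and .lamda. values are illustrative assumptions, chosen to resemble the ~100-LTU example of FIG. 8.

```python
import math
import random

def wire_liquid(positions, C=0.3, lam=2.0, seed=0):
    """Return a list of directed CNU connections (i, j), where a pair is
    connected with probability C*exp(-D(i, j)/lam**2), D being the
    Euclidean distance between LTU i and LTU j (equation (4))."""
    rng = random.Random(seed)    # fixed seed: each core keeps its own pattern
    connections = []
    for i, (xi, yi) in enumerate(positions):
        for j, (xj, yj) in enumerate(positions):
            if i == j:
                continue
            d = math.hypot(xi - xj, yi - yj)
            if rng.random() < C * math.exp(-d / lam ** 2):
                connections.append((i, j))
    return connections

# A 10x10 grid of 100 LTUs, as in the example of FIG. 8.
grid = [(x, y) for x in range(10) for y in range(10)]
cnus = wire_liquid(grid)
```

Giving each core its own random C and .lamda. values, as described above, would yield the desired diversity of processing patterns across cores.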
[0094] The liquid section 46 may be defined to receive two
dimensional data such as a binary matrix. In such an embodiment the
liquid section 46 is sensitive to the spatiotemporal structure of
the streaming data. An example of such a data input is depicted in
FIG. 9B, which depicts a two dimensional input 250 which is injected
into the liquid section, and a set of LTUs 251 that is responsive
to the present input at a given time and to dynamic processes. The
set of LTUs 251 constitutes a unique state which can later be
associated with the received input, as described below in relation to the learning
mode.
[0095] Reference is now made to FIG. 9C, which is a graphical
representation of a digital implementation of the liquid section
1500, according to one embodiment of the present invention. FIG. 9C
depicts an exemplary implementation of one LTU 1502 and a network
buffer 1500. In this embodiment, simpler components 1502 are used
to implement the liquid section 1506.
[0096] FIG. 9C only depicts one exemplary LTU 1502, which is
attached to a subtraction element 1504. Other LTUs are not depicted
in the figure, for simplicity and clarity of the description.
The LTU 1502 is configured according to Mux-Adder logic. The LTU
1502 is designed to receive values to its counter from a set of
other LTUs by a set of connections W.sub.1, W.sub.2, W.sub.3 and
W.sub.4. The connectivity of each connection is randomly generated,
with parameters defined according to a distribution based on an
analysis of the input signals. For example, only 10 percent of the
possible connections between different pairs of LTUs are connected,
wherein 10 percent of them function as inhibitory neurons. The
network is fed by a temporal input, which is denoted
(K.sub.{in}(t)), which is injected into selected set of input LTUs.
An exemplary input is depicted in FIG. 9C, as shown at 1503. As a
set of inputs may be injected into the input LTUs, two dimensional
inputs, such as a binary matrix that represents an image, can be
processed. The output counter value of the LTU 1502 is
injected into a neighboring LTU N(t+1) and into a subtracting element
1504. The subtracting element 1504 subtracts the leakage counter
value 1505 from the received counter value and injects the result
back to the network buffer K.sub.5(t+1).
[0097] The distribution of the connections is determined by
different distributions schemes, such as flat, discrete flat and
Gaussian distributions. The counter value is forwarded in a network
according to the following equations of motion:
n.sub.i(t+1)=[1-K.sub.i(t)]n.sub.i(t)+.SIGMA..sub.jW.sub.ijK.sub.j(t)-I
K.sub.i(t)=.theta.(n.sub.i(t)-th)
where
[0098] n.sub.i denotes the counter value of the LTU, K.sub.i
denotes a binary spiking indicator of the LTU, W.sub.ij is a
value that indicates the weight between LTU i and LTU j, .theta.(x)
denotes a Heaviside step function, th denotes the spiking threshold,
and I denotes the leakage term.
[0099] The Heaviside step function, which is also sometimes denoted
H(x) or u(x), is a discontinuous function also known as the
"unit step function", and is defined by:
.theta.(X)=0 for X<Threshold
.theta.(X)=1 for X.gtoreq.Threshold
[0100] The output of the network is collected, during or after the
processing of the inputs, from a set of output neurons, which is
denoted {out}.
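A minimal, illustrative simulation of the equations of motion above follows; it assumes the spiking indicator is 1 when the counter reaches the threshold th. The weights, the threshold, and the leakage value are assumptions made for this example.

```python
def step_network(n, W, th=10, leak=1):
    """One time step of the counter network:
    n_i(t+1) = [1 - K_i(t)]*n_i(t) + sum_j W_ij*K_j(t) - I,
    with K_i(t) the 0/1 spiking indicator of LTU i.
    Returns (new counter values, spiking indicators)."""
    K = [1 if ni >= th else 0 for ni in n]       # Heaviside step per LTU
    new_n = []
    for i, ni in enumerate(n):
        total = (1 - K[i]) * ni                  # a spiking LTU is reset
        total += sum(W[i][j] * K[j] for j in range(len(n)))
        total -= leak                            # leakage term I
        new_n.append(max(total, 0))              # counters stay non-negative
    return new_n, K
```

For example, with two LTUs where the first has reached threshold and feeds the second with weight 5, one step yields counters [0, 4] and spiking indicators [1, 0].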
[0101] Reference is now made, once again, to FIG. 2. The linker
section 47 is associated with the liquid section 46. The linker
section 47 is designed to capture the state of the liquid section
46 and to produce a core output accordingly. Preferably, the linker
section 47 is designed to generate one or more binary vectors from
the liquid state when respective external data streams are
identified thereby. A more elaborate example of such an embodiment
is described below in relation to FIG. 27.
[0102] The linker section 47 is designed to produce a core output
according to the state of the liquid section, preferably as a
reaction to the reception of such a binary vector. Preferably, the
linker section 47 maps the binary vector onto an output, as defined
in the following equation:
output=linker (state).
[0103] The output may also be understood as a binary value, a
vector, a clique of processors from the unique processing pattern,
a digital stream or an analog stream. The concept of the clique is
described hereinbelow.
[0104] The linker section 47 may be implemented by a basic circuit,
and transforms binary vectors or any other representations of the
state of the liquid section 46 into a single binary value using a
constant Boolean function. Consequently, the computational core is
able to produce an output which is a single binary value. More
sophisticated circuits, which allow the conversion of the received
binary vector to a digital value that more precisely represents the
processed external data stream, may also be implemented in the linker
section 47. The linker section 47 may alternatively or additionally
incorporate an analog circuit that operates over a predetermined
time window and eventually leads to a digital output that is
representative of behavior in the computational core over the
duration of the window.
[0105] Reference is now made to FIG. 9A, which is a schematic
illustration of the linker section 47, according to a preferred
embodiment of the present invention. The linker section 47 may
comprise a number of registers, as shown at 96, which are
configured to store a number of values, such as binary vectors, that
may be matched with the outputs of the liquid section. The linker
section 47 further comprises a linking function unit 200. The
linking function unit 200 is designed to match the received
vectors against values which are stored in the registers of the
linker section 47.
[0106] Reference is now made, once again, to FIG. 2. The
computational core 100 is designed to operate in separate learning
and operational modes. The learning mode may be referred to as
melting, in that new states are melted into the liquid, and then
the liquid is frozen for the operation state, and thus the
operational mode is regarded as a frozen state. When a new external
data stream is presented to the computational core 100, the
learning mode is activated. The learning process, which is
implemented during the learning mode, ensures that the
computational core 100 is not limited to a fixed number of external
data streams and that new limits can be dynamically set according
to one or more new external data streams which are introduced to
the computational core 100.
[0107] During the learning process, the reception of a new external
data stream may trigger the liquid section 46 to output a binary
vector to the linker section 47. The generation of a binary vector
depends on the distribution of CNUs over the liquid section 46, as
described above. When a binary vector is output, the liquid section
switches to operational mode. The binary vector is output to the
linker section 47 that stores the received binary vector,
preferably in a designated register, and then switches to
operational mode. An exemplary register is shown at 96 of FIG. 9A.
When the linker section 47 is in operational mode, all the outputs,
which are received from the liquid section 46, are matched against
the binary vectors which are preferably stored in the registers of
the linker section.
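The melting and freezing behavior described above can be sketched as follows. The class name, the dict-based registers, and the method names are illustrative assumptions; the described device uses hardware registers and a linking function unit, not this Python structure.

```python
class Linker:
    """Sketch of the linker section's learning and operational modes."""

    def __init__(self):
        self.registers = {}      # stored liquid state -> discrete output
        self.learning = True     # start in the learning ("melting") mode

    def melt(self, state, output_value):
        """Learning mode: store a new liquid state in a register."""
        self.registers[tuple(state)] = output_value

    def freeze(self):
        """Switch to the operational ("frozen") mode."""
        self.learning = False

    def match(self, state):
        """Operational mode: map a liquid state to the core output;
        None means no stored state matched (no recognition flag set)."""
        return self.registers.get(tuple(state))

linker = Linker()
linker.melt([1, 0, 1, 1], "pattern-X")   # melt a new state into the liquid
linker.freeze()                          # freeze for the operational mode
```

In this sketch, an unmatched liquid state simply yields no output, corresponding to a core that does not recognize the input.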
[0108] The learning mode provides the computational layer with the
ability to learn and adapt to the varying environment. Breaking the
environment into external data streams that represent
context-dependent properties allows learning and adaptation at both
the level of a single computational unit and at the global level of
an architecture incorporating a large number of processing units.
The learning mode provides the computational layer with a high
dimensional ability of learning and adaptation which is reflected
by the inherent flexibility of the computational layer to be
adjusted according to new signals.
[0109] Such a learning process may be used to teach the
computational layer to perform human-supervised operations.
Performing such operations keeps the user out of the loop for as
long as possible, until the user is required to provide guidance on
critical decisions. Thus, the role of the human is significantly reduced.
[0110] During the operational mode, the computational core 100
receives the external data streams. The liquid section 46 processes
the external data streams and, based thereupon, outputs a binary
vector to the linker section 47. The linker section 47 compares the
received binary vector with a number of binary vectors which
preferably were stored or frozen into its registers during the
learning mode, as described above. The linker section 47 may be
used to output either a binary value, a vector representing the
output of the liquid section, as explained below in relation to
FIG. 27, or a value which is associated with a certain possible
output of the liquid section. The linking function unit of the
linker section 47 preferably outputs a certain current that
indicates whether a match has been found to the received input.
Preferably, the linking function unit updates a flag that indicates
that an external data stream has been identified. As further
described below, the core outputs are injected into central
processing units that analyze all the outputs of the different
cores and generate an output based thereupon.
[0111] Reference is now made to FIG. 10, which is a schematic
illustration of the computational core 100 that is depicted in FIG.
2, and an additional output circuit 400. The additional output
circuit 400 comprises a comparator 401, an AND gate 402, and an
external bus interface 403. The output circuit 400 is connected to
a controller 50 which comprises registers 1, 2, 3, and 4 and
which is updated according to the mode of the computational core
100 and related inputs and outputs. In the exemplary set which is
depicted in FIG. 10, the value of register 1 is determined
according to the input bus bits and the value of register 2 is
determined according to the output bus bits. The value of register
3 reflects the current operation mode of the computational core.
The value of register 4 is the outcome of a winner-takes-all
algorithm, which is used to indicate whether or not the
computational core 100 identifies the input external data stream,
as further described below.
[0112] The outputs of the linker section 47 are transmitted via
gates 401 and 402 to the external bus interface 403 when a flag in
the controller 50 is set to indicate that a predefined input is
recognized. The external bus interface 403 outputs the received
transmission via output pins 48.
[0113] As described above, all the computational cores are
preferably embedded in one electric circuit that constitutes a
common logical layer. The computational cores receive,
substantially simultaneously, signals originating from a common
source. Each one of the computational cores separately processes
the received signals and, via the output of the linker section 47,
outputs a binary value. Preferably, all the outputs are transferred
to a common match point, as described below.
[0114] The terms "simultaneously" and "substantially simultaneously"
may be understood as "at the same time" and "simultaneously in
phase". The term "at the same time" may be taken as within a small
number of processor clock cycles, and preferably within two clock
cycles.
[0115] Reference is now made to FIG. 11A, which is a block diagram
of the structure of an exemplary computational layer 1 of a
proactive computational unit, according to one embodiment of the
present invention. The exemplary computational layer 1 comprises
twelve computational cores 100, connected to a bus (not shown), an
input 61, and an output 64. It should be noted that FIG. 11A is an
exemplary diagram only and that any number of parallel-operating
computational cores 100 which are connected by a bus can be
considered as a computational layer 1. In use, arrays of thousands
of computational cores may be used by the computational layer 1.
The small number of computational cores which is used in FIG. 11A
and in other figures has been chosen only for simplicity and
clarity of the description.
[0116] As described above, each one of the computational cores 100
is designed to simultaneously receive an external data stream
and to output, based thereupon, a discrete value. The discrete
value stands for a certain signal which has been introduced to the
computational core beforehand, a signature having been stored in
memory in connection with the discrete value. In
one embodiment of the present invention, the computational layer 1
is used for classifying external data streams.
[0117] As described above, during the learning mode, a number of
external data streams are injected to each one of the computational
cores 100. Each computational core receives the external data
stream and injects it into the liquid section. The liquid section
produces a unique output based on the received external
data. The unique output is preferably stored in connection with a
discrete value. A number of different external data streams or
classes are preferably injected to each computational core that
stores a number of respective unique outputs, preferably in
connection with a respective number of different discrete numbers.
Now, during the operational mode, after a set of unique outputs
have been associated with a set of discrete values, the
computational cores 100 can be used for parallel classification of
external data streams which are received via the input 61. Such
classification can be used in various tasks such as indexing,
routing, string matching, recognition tasks, verification tasks,
tagging, outlier detection, etc.
[0118] The discrete values are forwarded, via a common bus, to the
common output 64, which is preferably connected to a central
processing unit (not shown). The central processing unit
concentrates all the discrete values which are received from the
computational cores 100 and outputs a more robust classification of
the received external data stream. For example, FIG. 11B and FIG.
11C are exemplary computational layers, as in FIG. 11A, in which the
layer receives two different external data streams 1113, 1114, which
are identified by different sets of computational cores 100. FIG.
11B depicts a set of computational cores 1111 that identifies a
certain pattern X in the external data stream 1113, and generates
core outputs based thereupon. FIG. 11C depicts another set of
computational cores 1112 that identifies a certain pattern Y in the
external data stream 1114, and generates other core outputs based
thereupon.
[0119] As described below in relation to FIG. 31, the core outputs
are forwarded to a central processing unit that uses one or more
voting algorithms, such as a majority voting algorithm, for
analyzing the outputs of the cores. The voting algorithms may be
based on the Condorcet's jury theorem. The theorem states that
where the average chance of a member of a voting group making a
correct decision is greater than fifty percent, the chance of the
group as a whole making the correct decision will increase with the
addition of more members to the group. As the average chance of
each one of the computational cores 100 to classify the received
external data stream is greater than fifty percent and the central
computational core receives the discrete values of a number of
computational cores, the central computational core has better
chances to accurately classify the received external data stream.
It should be noted that the chances to accurately classify the
received external data increase with the addition of more
computational cores 100 to the computational layer 1.
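The central processing unit's vote over the core outputs can be sketched as a simple majority count. The function name and the use of None for an unrecognized input are assumptions made for this example.

```python
from collections import Counter

def majority_vote(core_outputs):
    """Return the classification produced by the most cores; cores that
    did not recognize the input (None) do not vote. Per Condorcet's jury
    theorem, adding cores with better-than-chance individual accuracy
    improves the group decision."""
    votes = Counter(v for v in core_outputs if v is not None)
    if not votes:
        return None              # no core recognized the input
    return votes.most_common(1)[0][0]
```

For instance, if three cores output "X", one outputs "Y", and one recognizes nothing, the layer-level classification is "X".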
[0120] In one preferred embodiment of the present invention, the
computational cores 100 are divided into a number of subgroups,
which are assigned to a respective number of tasks. In such an
embodiment, each subgroup is programmed during the learning mode,
as described above, to identify one or more patterns in the
external data streams. For example, one subgroup of computational
cores may be assigned to process voice signals, while another is
assigned to process video signals. In such an
embodiment, the outputs of one subgroup may be connected, via
output 64, to one central processing unit, while another subgroup
may be connected to another central processing unit.
[0121] In one embodiment, as depicted in FIG. 34, which is a
computational layer as depicted in FIG. 11A above, the
computational layer 1 may be designed to process external data
streams 550, 551, 552 obtained from many heterogeneous sensors 553,
554, 555, on many platforms. In such an embodiment, which may be
used for data fusion applications, different subgroups of
computational cores 556, 557, 558 are assigned to process external
data streams which originate from different sensors or
platforms. In such an embodiment, external data streams which are
received substantially simultaneously from different sensors such
as sound and image sensors are processed in parallel by different
subgroups of computational cores. Such an embodiment can be
beneficial in speech recognition as the voice of the speaker and
the motion of his lips can be analyzed in parallel.
[0122] It should be noted that the computational layer 1 may also
be implemented as a software module which can be installed on
various platforms, such as standard operating systems like Linux,
real time platforms such as VxWorks, and platforms for mobile
device applications, such as cell phone platforms, PDA platforms,
etc.
[0123] Such an implementation can be used to reduce the memory
requirements of particular applications and enable novel
applications. For example, for implementing a recognition task, a
software module with only 100 modules that emulate computational
cores is needed. In such an embodiment, each emulated computational
core comprises 100 counters, which are defined to function as the
aforementioned LTUs. The counters have to be connected or
associated. Each core can be represented as an array of simple type
values and the nodes can be implemented as counters with compare
and addition operations.
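As a rough, hypothetical illustration of the footprint of such a software module, the 100 emulated cores of 100 counters each can be held in plain arrays; the 16-bit counter width below is an assumption made for this example.

```python
import array

NUM_CORES = 100       # emulated computational cores
LTUS_PER_CORE = 100   # counters functioning as LTUs in each core

# One array of unsigned 16-bit counters per emulated core; each core is
# just an array of simple type values, as described above.
cores = [array.array("H", [0] * LTUS_PER_CORE) for _ in range(NUM_CORES)]

# Total counter storage for the whole emulated layer.
total_bytes = NUM_CORES * LTUS_PER_CORE * cores[0].itemsize
```

The compare-and-add update of each counter would then follow the equations of motion given earlier for the digital implementation.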
[0124] Reference is now made to FIG. 12, which is a schematic
representation of a computational core 100 and a computational
layer 1, according to a preferred embodiment of the present
invention. Although only one computational core 100 is depicted, a
large number of computational cores 100 may similarly be connected
to the computational layer 1. While the computational core 100 and
computational layer 1 are as depicted in FIG. 11A, FIG. 12 further
depicts the connections between the outputs and inputs of the
exemplary computational core 100 and the inputs and outputs of the
exemplary computational layer 1. It should be noted that the
depicted computational core 100 is one of a number of computational
cores which are embedded into the computational layer 1 but, for
the sake of clarity, are not depicted in FIG. 12.
[0125] FIG. 12 further depicts a resource allocation control (RAC)
unit 26 that is preferably connected to each one of the
computational cores of the computational layer 1. Each one of the
computational cores 100 in the computational layer 1 is connected
to a number of input and output connections. Input signals, which
are received by the computational layer 1, are transferred to each
one of the computational cores via a set of external input pins 61,
through an external input buffer 62. Input signals may also be
transferred to each one of the computational cores via a layer N-1
input buffer 66. When the computational layer 1 is one of a number
of sequentially connected computational layers, the layer N-1 input
buffer 66 is used to receive core outputs from another
computational layer.
[0126] Core outputs from the computational cores 100 are received
at a set of external output pins 64. The core outputs are
transferred via an external output buffer 63. Preferably, if the
core outputs have to be further processed, the outputs of the
computational cores 100 may be sent to another computational layer,
via a layer N+1 output buffer 65, as described below in relation to
FIG. 13.
[0127] Reference is now made, once again, to FIG. 11A. As shown in
the figure, each one of the computational cores 100 is preferably
an autonomous unit that is connected separately to the inputs and
outputs of the computational layer 1. That is to say, computational
cores 100 belonging to the same layer are autonomous and do not
require cross-core communication.
[0128] As no cross-core communication is required, segmentation of
the external data-stream into properties (intrinsic dimensions) is
simplified. For example, when the external data stream is an audio
waveform, it is segmented into sub-inputs and preprocessed by
encoders to provide the computational cores 100 with the desired
input format. Alternatively, the external data stream may first be
preprocessed by the encoder and then subdivided among the
computational elements, or not subdivided at all. Thus, each
property is represented by a
temporal input with finite dimension and duration. The dimension is
determined by the number of external pins of the input, as further
described below, and the duration is determined and constrained by
memory capacity of each one of the computational cores 100. The
computational layer 1 and each one of the computational cores 100
are adaptively reconfigurable in time. The configuration at the
computational cores 100 level is manifested by allocation of
available cores for a specific sub-instruction, as described below
in relation to FIG. 17A, while another sub-instruction may be
executed with a different configuration of the computational cores.
At the computational layer level, the reconfiguration is a dynamic
allocation of the number of layers and their connectivity to other
layers, as described in relation to FIG. 17B. It should be noted
that all the cores may process the same data without dynamic
allocation.
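The segmentation constraint described above can be made concrete with a short sketch. The function name and the drop-trailing-samples policy are assumptions; the two constraints (dimension fixed by the number of external input pins, duration bounded by core memory capacity) come from the text.

```python
def segment_stream(stream, dimension, duration):
    """Split a flat sample stream into temporal inputs of shape
    (duration, dimension): `dimension` is the number of external
    input pins, `duration` is the limit imposed by the memory
    capacity of each computational core. Trailing samples that do
    not fill a full segment are dropped (assumed policy)."""
    frame = dimension * duration
    segments = []
    for start in range(0, len(stream) - frame + 1, frame):
        chunk = stream[start:start + frame]
        # one row of `dimension` values per time step
        segments.append([chunk[t * dimension:(t + 1) * dimension]
                         for t in range(duration)])
    return segments
```

Each returned segment is a finite temporal input that a core can hold in full, matching the finite-dimension, finite-duration requirement.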
[0129] Reference is now made to FIGS. 14A and 14B, which are
graphical representations of a computational layer 1, similar to
that shown at FIG. 11A, which is connected to a single encoder 15
(FIG. 14B), according to one embodiment of the present invention
and to a number of different encoders 9, 10, and 11 (FIG. 14A),
according to another embodiment of the present invention. This
embodiment may be used as a solution for any signal processing
problem, such as signal recognition or classification. The external
data stream 5 is preprocessed by the encoder 15, to transform the
signal into a desired format. Different kinds of signals may be
preprocessed by different signal-type-dependent encoders. In FIG.
14A, the external data-stream 5 is segmented into sub inputs 6, 7,
8 which are respectively preprocessed by a number of different
encoders 9, 10, 11 into different digital streams 12, 13, 14.
Preferably, each one of the encoders 9, 10, 11 is designed to
encode the sub input it receives according to an encoding scheme
which might be different from the encoding schemes of the other
encoders. Preferably, as shown in FIG. 14B, the external data
stream 5 is divided into the digital streams only after the single
encoder 15 has preprocessed it.
[0130] As depicted in FIGS. 14A and 14B, each one of the digital
streams 12, 13, 14 constitutes a temporal input with a finite
dimension and duration. The number of the external input pins 61 of
the computational layer 1 determines the finite dimension of the
temporal input. The memory capacity of the computational cores
determines the duration to which the temporal input is limited.
[0131] As described above, the digital streams 12, 13, 14 are
transmitted through the external input pins 61 of the computational
layer 1 to all the connected computational cores 100. Preferably,
the external data stream 5 is continuous in time and is not broken
into data packets. It should be noted that different computational
cores 100, which receive different digital streams 12, 13, 14, may
asynchronously generate core outputs.
[0132] The external data streams 5, which are preferably based on
signals from the real world such as sound and image waveforms, are
usually received in a continuous manner. In order to allow
processing thereof by the computational cores, the encoder 15 or
encoders 9, 10, 11 have to segment the streams into inputs, each
with a finite length. The input streams, which are encoded
according to the received external data stream 5, may be segmented
according to various segmentation methods. Such segmentation
methods are well known and will not, therefore, be described here
in detail.
[0133] In one embodiment of the present invention, more than one
computational layer 1 is connected in parallel to a common input.
An example for such architecture is shown in FIG. 15A that depicts
a digital stream, which is encoded according to a received external
data stream and is divided between the computational cores
according to a hard-coded division method. It should be noted that
the external data stream may be divided according to different
properties of the external data stream. FIG. 15B depicts another
embodiment of the present invention in which the external data
stream is divided according to a dynamic division method. In such a
division method, different segments are transmitted in parallel to
different cores. The segment that one computational core receives
may have a different length from those received by other
computational cores. The segments which are received by different
computational cores may overlap.
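The dynamic division of FIG. 15B may be sketched as follows; the start/length parameterization is an assumption chosen to show that segments can differ in length and overlap, while the hard-coded division of FIG. 15A corresponds to equal, non-overlapping windows.

```python
def dynamic_division(stream, starts, lengths):
    """Dynamic division: core k receives
    stream[starts[k] : starts[k] + lengths[k]]. Segments may have
    different lengths and may overlap."""
    return [stream[s:s + n] for s, n in zip(starts, lengths)]

def hard_coded_division(stream, n_cores):
    """Hard-coded division: equal, non-overlapping windows,
    one per core (trailing remainder dropped, assumed policy)."""
    size = len(stream) // n_cores
    return [stream[i * size:(i + 1) * size] for i in range(n_cores)]
```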
[0134] Reference is now made to FIG. 13, which is a schematic
representation of a proactive computational unit 120, according to
one embodiment of the present invention. The RAC unit 26, the
computational layer 1, and the connections between them are as
depicted in FIG. 12, however, FIG. 13 further depicts a set of
additional layers N-3, N-2, N-1, N which are sequentially connected
to each other, where there are N layers in total.
[0135] The number of computational cores in each computational
layer may be different. The distribution of the cores in the layers
is task-dependent and is preferably performed dynamically. The
allocation of the number of cores per layer M and the number of
layers N in the proactive computational unit 120 is determined by
the RAC unit 26, in a manner such that N*M remains constant. The
RAC unit 26 communicates with each one of the computational layers
1 . . . N-3, N-2, N-1, and N through a related set of control pins,
as shown at 27. The computational layers are preferably connected
in a sequential order.
[0136] FIG. 16A, which is a schematic representation of two
computational layers 22 and 23, depicts such a connection. The
communication between the two computational layers 22 and 23 takes
place through the external input pins 28 and 30 and external output
pins 29 and 31 of the layers, respectively. As depicted in FIG.
16A, the communication is from the external output pins 29 of the
first layer 22 to the external input pins 30 of the second layer
23. Such an embodiment allows the outputs of the first layer 22 to
be integrated in time before they are entered into the second layer
23. An example of such time integration can be found in FIG. 16B,
which is a graphical representation of the communication between
the first and second layers 22 and 23 during a certain period 170.
The first layer 22 is depicted in three consecutive time periods
32, 33, 34, during which it sends respective outputs 35, 36, 37 to
a buffer 40. The buffer 40 gathers all the received outputs 35, 36,
37 and integrates them into a new data stream 38. The new data
stream 38 is sent to the second layer 23 in period 39.
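The time integration performed by buffer 40 can be sketched as below. Concatenation is assumed as the integration rule; the text only states that the outputs of consecutive periods are gathered and integrated into a new data stream.

```python
class IntegrationBuffer:
    """Gathers the outputs of layer N over consecutive time periods
    and integrates them into one new data stream for layer N+1
    (concatenation assumed as the integration rule)."""

    def __init__(self):
        self._collected = []

    def push(self, period_output):
        # one call per period, e.g. outputs 35, 36, 37 in FIG. 16B
        self._collected.append(period_output)

    def integrate(self):
        # produce the new data stream (38) and reset for the next round
        merged = [v for out in self._collected for v in out]
        self._collected.clear()
        return merged
```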
[0137] The architecture of the computational layers and cores is
adaptively reconfigurable in time. The configuration at the
computational cores' level is manifested by allocation of available
cores for a specific sub-instruction, while another sub-instruction
may be executed using a different configuration of the cores. For
example, as depicted in FIG. 17A, which is a graphical
representation of a computational layer during three different
sub-instructions, a different configuration of the cores is used
for each of the sub-instructions 41, 42, and 43.
[0138] The configuration at the layers' level is depicted in FIG.
17B. The Figure depicts two possible connection schemes 120 and 121
between the computational cores of a first computational layer and
two other consecutive computational layers. The configuration of
the connections is dynamically arranged by changing the connection
between one or more computational layers. As depicted in FIG. 17B,
the connections between the external output pins of one
computational layer and one or more external input pins of another
computational layer are reconfigurable.
[0139] Reference is now made to FIG. 18, which is a schematic
representation of the computational core 100 of FIG. 2 and a
connection thereof to the RAC unit 26. As depicted in FIG. 18, the
exemplary control unit 50 is connected to the RAC unit 26 via an
I/O control BUS and I/O control pins 180. As described above, each
one of the computational cores 100 is designed to operate in both
learning and operational modes.
[0140] During the operational mode, as described above, each one of
the cores is designed to generate a core output, such as a binary
value or a binary vector, if a match has been found between the
information, which is stored in one of its registers, and the
presented input. In the simpler embodiments the core output is a
binary value. Thus, only when a computational core identifies the
presented input will it generate an output. As all the
computational cores are connected to the RAC unit 26, the RAC unit
can identify when one of the computational cores has identified the
presented input. This allows the execution of a "winner-takes-all"
algorithm. When such a scheme is implemented, if one of the cores
recognizes the presented input, it raises a designated flag and
thereby signals the RAC unit, that the presented input has been
identified.
[0141] In a preferred embodiment of the present invention, the
computational layer enters the learning mode if none of its
computational cores 100 recognizes the presented input. When the
presented input is not recognized by any of the computational
cores, the entire layer switches to learning mode. As each one of
the computational cores 100 is connected to the RAC unit via a
separate connection, this allows the RAC unit to recognize when a
certain input is not recognized by any of the computational cores
100. Preferably, each computational core 100 signals the RAC unit
26 or a central computing device that it did not recognize the
received input by changing or retaining a binary value in control
unit 50. The computational layer stays in the learning mode until
at least one of the cores recognizes the presented input and
signals the RAC unit 26, preferably by raising a flag.
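The winner-takes-all scheme and the mode switch of the two preceding paragraphs can be summarized in one small RAC-side sketch. The function name, the tuple return value, and taking the lowest-indexed flag as the winner are assumptions; the flag-raising behavior and the switch to learning mode when no core recognizes the input come from the text.

```python
def rac_step(core_flags):
    """core_flags[i] is True if core i raised its recognition flag.
    If at least one core recognized the input, the winner-takes-all
    result is reported; otherwise the whole layer switches to
    learning mode."""
    winners = [i for i, flag in enumerate(core_flags) if flag]
    if winners:
        return ("operational", winners[0])  # winner takes all (first flag, assumed)
    return ("learning", None)               # no core recognized the input
```

The layer would then remain in learning mode until some later call returns an `"operational"` result.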
[0142] As described above, the proactive computational unit is
based on asynchronous and parallel operation of multiple
computational cores. Such a proactive computational unit may be
used for various signal-processing applications.
[0143] Reference is now made to FIG. 19, which is a schematic
representation of a computational layer 1 that is connected to a
single encoder 70, similar to that shown in FIG. 14B, according to
another embodiment of the present invention. As depicted in FIG.
19, an external data stream 5, such as a voice waveform, an image
waveform, or any other real world output, is encoded by the encoder
70. Based thereupon, the encoder generates an encoded signal 171,
such as a digital stream or a signal in any other desired format.
It should be noted that, as different signal-type-dependent
encoders may preprocess different kinds of signals, the encoder 70
which is used is chosen according to the received signals. The
encoded signal 171 is transferred, as described above, to the
computational layer 1 in a manner such that each computational core
100 receives the entire input. Now, each one of the computational
cores processes the received encoded signal 171 and, based
thereupon, generates an output, such as a binary value. As the
internal architecture of all the computational cores 100 is
generated in a random manner, according to different parameters, in
order to ensure distribution and heterogeneity among the
computational cores, each core maps the given signal into a
different location.
[0144] Reference is now made to FIG. 25B, which is a schematic
representation of a computational layer 1. Each computational core
100 and the external output and input pins 48 and 49 are as
depicted in FIG. 12. In FIG. 25B, however, the computational layer
1 further comprises a memory array 87. Each one of the
computational cores 100 is preferably connected to a different cell
in the memory array. As described above, each one of the
computational cores is randomly structured. Therefore, the reaction
of different computational cores to a certain signal is not
homogenous. As each computational core is randomly structured, the
scope of possible outputs of the liquid section 46 of the
computational core can be represented in a three dimensional space,
as depicted in FIG. 20. The outputs of different computational
cores 100 are transmitted to different locations in the space, as
shown at 71 and 72 of FIG. 20. The transformation of the outputs of
the computational cores 100 into a setting on a spatial-temporal
map is a non-linear process that enables the generation of complex
spatial maps of different groups of computational cores. In order
to adjust a unique spatial map to a particular input signal, one or
more reporting LTUs are chosen in each one of the computational
cores. The number of reporting LTUs which are defined in a certain
computational core for one input signal varies between one LTU and
the total number of LTUs of the computational core. That is,
anything between one and all of the LTUs can report for any given
input signal.
[0145] Preferably, in order to increase the scope for identified
signals, the reporting LTUs may be defined using a time function.
For example, as shown in FIG. 21, a certain computational layer
comprises twelve computational cores with different LTUs as
reporting LTUs in different time quanta 73, 74, and 75. For
example, in the first time quantum 73, only one LTU, which is
marked as LT66, is chosen as a reporting LTU. Two
reporting LTUs, which are marked as LT66 and LT89, are chosen in
the second time quantum 74. In the third time quantum 75, two
different reporting LTUs, which are marked as LT66 and L7A4, are
chosen.
[0146] As described above, each one of the LTUs outputs a binary
value, thus by choosing one reporting LT, the space represented by
each core is divided into two sub-spaces/planes, and a given signal
is ascribed to only one sub-space. Respectively, by choosing two
reporting LTUs, a two bit response is possible and the space is
divided into four sub-spaces. Thus a given signal is ascribed to
one sub-space of four. As different subspaces are associated with
different signals, each core may be used to identify a number of
different signals. FIG. 22 is a graphical representation of the
division of a certain space into two subspaces by using one
reporting value, as shown at 77. By choosing two reporting LTUs,
the space may be divided into four subspaces, as shown at 78. By
adding additional reporting LTUs, one can divide the space of a
certain computational core as much as necessary.
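The bit-to-subspace mapping described above is just binary indexing, which the following sketch makes explicit (the function name and bit ordering are assumptions):

```python
def subspace_index(reporting_bits):
    """k reporting LTUs each output one bit, dividing the core's
    space into 2**k subspaces; the bits together form the index of
    the subspace to which the given signal is ascribed."""
    index = 0
    for bit in reporting_bits:
        index = (index << 1) | bit
    return index

# one reporting LTU -> 2 subspaces, two -> 4, three -> 8, and so on
```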
[0147] During the learning process, the system preferably receives
a number of samples of a given signal, and these are sent to the
various cores to learn the signal. The variations of the signal are
typically the signal with added noise, the same word spoken by
people with different accents, and so on. In order to ensure the
identification of variations of the given signal during the
operational mode, the computational core has to locate all the
variations of the same signal in the same sub-space. Since the
sub-spaces, generated by dividing the total-space with several
reporting LTUs, are quite large, the task of clustering the signal
into one sub-space is feasible.
[0148] As described above, since all the cores in the system are
heterogeneous, each core represents the given signal differently
within its own space, thus generating n different signal spaces
where n denotes the number of cores in the computational layer.
Thus, each input signal is located by n computational cores in n
different signal spaces.
[0149] Reference is now made to FIG. 23, which is a set of graphs
representing the transformation of a signal into n different
spaces, each corresponding to one of the computational cores. This
set depicts a projection of signals by each of the computational
cores into two-dimensional spaces, representing the state indicated
in this example by the LTUs. Each dot 79 in the n graphs represents
a core state of one of the n computational cores to 1 sample of a
given class. Functions f1, . . . , fn divide the core spaces such
that more than 50 percent of the core outputs for a certain signal
are mapped into the same subspace or plane.
Preferably, for each given signal received by the computational
layer during the learning mode, a unifying three-dimensional
subspace is generated by conjugating all the subspaces that were
generated by different computational cores during the learning
process. An example of such a conjugation process is exemplified by
FIG. 24, wherein there are depicted two different subspaces 191 and
192, which have been generated by different computational cores
during the learning process in response to a certain signal. The
two different subspaces 191 and 192 are designed to exploit the
combined decision making capabilities of the two cores as depicted
in the example of FIG. 23, and to identify a certain signal during
the operational mode.
[0150] In such an embodiment, the learning process may be divided
into several steps:
1) Indexing LTU--associating one or more reporting LTUs with a
novel signal. 2) Mapping--allowing all the computational cores of
the computational layer to receive the novel signal several times.
3) Defining--storing a set of computational cores as reporting
cores. The set may comprise some or all of the cores. The chosen
reporting cores are preferably computational cores that
consistently identify the novel signal or a set of signals
belonging to the same class. Each reception of the novel signal or
a signal belonging to a set of signals of the same class reduces
the number of reporting cores, as fewer computational cores
consistently identify the novel signal as the number of reception
iterations increases. Preferably, the reception iterations last
until a stable signal, representing a conjugated subspace,
remains.
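The Mapping and Defining steps amount to intersecting the set of responding cores across reception iterations; the following is an illustrative sketch (the set representation and the minimum-size guard are assumptions, the shrinking-group behavior is from the text):

```python
def prune_reporting_cores(responses_per_iteration, min_cores=1):
    """responses_per_iteration[t] is the set of core indices that
    identified the novel signal in reception iteration t. Start with
    every responding core as a reporting core and keep only those
    that respond consistently, stopping before the group would shrink
    below min_cores (the guard against over-diminishing the group)."""
    reporting = set(responses_per_iteration[0])
    for reacted in responses_per_iteration[1:]:
        survivors = reporting & set(reacted)
        if len(survivors) < min_cores:
            break
        reporting = survivors
    return reporting
```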
[0151] The table depicted in FIG. 25A shows the outputs of
predefined reporting LTUs of a computational layer with
twelve cores. Each computational core has a common reporting LTU,
which is designed for seven reception iterations for each novel
signal. A table cell, which is colored gray, indicates that the
related computational core reacts to the reception of the novel
signal during the related reception iteration. A table cell colored
white indicates that the related computational core did not react
to the novel signal in the related reception iteration. In the
exemplary table, all the computational cores output a response in
the first reception iteration. At this stage, all the cores may be
considered as reporting cores. In response to the second reception
iteration, computational core 8 is assumed to be unstable and is
excluded from the group of reporting cores. In the third reception
iteration, computational cores 1 and 11 are also removed from the
group of reporting cores. After the seven reception iterations,
only the most stable cores 2, 7, and 12 are left in the group.
Preferably, a minimum number of computational cores is defined, in
order to avoid emptying or over-diminishing the group of reporting
cores during the reception iterations.
[0152] Preferably, for each novel signal, reporting cores are
chosen according to statistical analysis. In such an embodiment, a
reporting core is chosen according to a certain threshold, such as
the percentage of positive responses to the novel signal within a
given set of reception iterations. For example, if the threshold is
set to 100% only computational cores 2, 7 and 12 are considered as
reporting cores. If the threshold is set to 80%, cores 3 and 6 are
also considered as reporting cores.
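The threshold-based selection can be sketched directly from the example in the text (cores 2, 7, and 12 respond in all seven iterations; cores 3 and 6 in six of seven). The dictionary representation is an assumption:

```python
def select_reporting_cores(response_counts, n_iterations, threshold):
    """Statistical selection of reporting cores: a core is kept if
    its fraction of positive responses to the novel signal over the
    reception iterations meets the threshold (e.g. 1.0 or 0.8)."""
    return sorted(core for core, hits in response_counts.items()
                  if hits / n_iterations >= threshold)

# Toy data mirroring the FIG. 25A example described in the text:
counts = {1: 3, 2: 7, 3: 6, 6: 6, 7: 7, 12: 7}
```

With a 100% threshold only cores 2, 7, and 12 survive; lowering it to 80% admits cores 3 and 6 as well.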
[0153] Preferably, at the end of the learning process, after
reporting cores are defined, the reporting cores and the index of
the corresponding signal are stored in a memory array 87, as shown
in FIGS. 25B and 25C. During the operational mode, the memory array
is matched with the outputs of the computational cores, and if
there is a match between the response of the computational cores
and a particular memory column, a relevant signal index is
extracted and transmitted via the external pins of the
computational layer.
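The operational-mode lookup against the memory array may be sketched as follows. Representing each stored column as a frozenset of reporting-core indices, and subset matching, are assumptions; the text only requires matching core outputs against stored columns and extracting the signal index.

```python
def match_memory_array(responding_cores, memory_array):
    """memory_array maps a stored reporting-core pattern (frozenset
    of core indices) to a signal index. If the cores currently
    responding contain a stored pattern, the corresponding signal
    index is extracted; otherwise None is returned."""
    responding = set(responding_cores)
    for pattern, signal_index in memory_array.items():
        if pattern <= responding:   # all reporting cores of the pattern fired
            return signal_index
    return None
```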
[0154] Reference is now made to FIG. 26, which is a schematic
illustration of a computational core 131 for processing one or more
data streams, in accordance with one embodiment of the invention.
The liquid section 46 and the linker section 47 are as depicted in
FIG. 2. In FIG. 26, however, the computational core 131 further
comprises an encoding unit 132, say for uses such as identifying
viruses in incoming data. The computational core 131 is a hybrid
analog-digital circuit which maps temporal segments of binary
streaming data S(t<t.sub.s) into cliques
or signatures. As described above, binary values are represented by
two constant voltage levels V.sub.high and V.sub.low. The liquid
section 46 is defined for passing, blocking, or classifying
received inputs. The unique signature of the received input, the
clique, is represented in the linker section 47 of the
computational core as an LTU clique. An LTU clique is a vector with
a finite length, having several discrete values. The values of the
LTU clique vector encode the LTUs that were found to be responsive
to certain input. Such an embodiment allows the association of
unique strings, regular expressions, video streams, images,
waveforms, etc., with a certain clique in a manner that enables
their identification, as described below.
[0155] Each input to be recognized defines a unique clique in each
one of the computational cores, which is configured during the
programming stage. As a result, the number of LT cliques is
determined according to the number of external data streams, which
have been identified as possible inputs, for example, a set of
strings or regular expressions. As described above, such an
embodiment allows parallel processing of the data by multiple
computational cores.
[0156] Preferably, one or more of the LT cliques encode several
identified external data streams. For example, several strings and
regular expressions may be associated with the same LT clique. The
linker section 47 is designed to identify the cliques during the
learning process. During the operational mode, the linker section
47 has to output a relevant LT clique whenever a specific external
data stream is identified by the liquid section 46, so that
identified features of the data stream are represented by the
clique. Thus the linker serves to map a clique onto an Output as
per the function:
Output = linker(clique).
[0157] The linker section 47 may be implemented as a pool of simple
LTUs, connected to the liquid section by CNUs. Preferably, during
the learning process, the weights of the CNUs are defined according
to the response probability for identifying an external data
stream, which is obtained from each LTU. The linker section may
also have other implementations, depending on the definition of the
linker section. The CNUs in the liquid section 46 are as described
above in relation to FIG. 5.
[0158] Reference is now made to FIG. 27, which is a schematic
representation of a computational layer 1, according to another
embodiment of the present invention. While the computational layer
1 is similar to that of FIG. 19 and the computational cores 131 are
as depicted in FIG. 26, a number of new components are added in
FIG. 27.
[0159] As described above, the computational layer 1 is designed to
allow parallel processing of an external data stream by a large
number of computational cores. The external data stream is input to
each one of the computational cores in parallel. The input is
preferably continuous in time.
[0160] As described above, each computational core 131 comprises an
encoding unit 132. The encoding unit is configured to continuously
encode received input data and to forward it to the liquid section
v(·).
[0161] Reference is now made to FIG. 28, which is a schematic
illustration of the encoding unit 132 and the external data stream,
according to a preferred embodiment of the present invention. As
depicted, the encoding unit 132 transforms the external data stream
5 into decimal indices 136. The decimal indices 136 determine which
input LTUs receive a certain portion of the external data stream 5.
For example, if the decimal indices 136 designate the line 7A to a
certain input LTU 137, line 7A will be transmitted directly via LTU
137. The encoding unit 132 preferably comprises a clock 138 which
is used during the encoding process. Preferably, the encoding unit
132 is designed to encode a predefined number of n bits each
clock-step.
[0162] The number of bits per clock step is encoded into one of the
decimal indexes, and defines the size of the liquid section, which
is needed to process the encoded input. The size N of the liquid
section is a function of n, and may be described by:
N ≥ 2.sup.n (5)
The implementation of the encoder may vary for different values of
n.
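The per-clock-step encoding and the sizing rule of equation (5) can be sketched together; the bit-to-index conversion is an assumed concrete form of the decimal indices 136 that select input LTUs.

```python
def encode_clock_step(bits):
    """The encoding unit reads n bits per clock step and maps them to
    a decimal index that selects which input LTU receives this
    portion of the external data stream (big-endian order assumed)."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return index

def min_liquid_size(n):
    """Equation (5): the liquid section needs N >= 2**n units to
    process n bits encoded per clock step."""
    return 2 ** n
```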
[0163] Reference is now made, once again, to FIG. 27. Each one of
the computational cores 131 is designed to produce D different
kinds of core outputs at any given time for a given computational
task, such as matching a string or regular expression
identification. The core outputs may be a binary value,
D = {0, 1}, or a discrete value, D = {0, . . . , n}. Preferably,
the core outputs are the discrete values, which are represented by
n cliques of LTUs 133. Such an embodiment allows each computational
core to identify n different signals 171, such as strings or
regular expressions, following encoding by the encoder 130 in the
received external data stream.
[0164] In such an embodiment, the computational core forms a
filter, which ignores unknown external data streams and categorizes
only those external data streams which were recognized. As depicted
in FIG. 27, the LTUs of a certain clique are connected to a cell in
an array 112 that represents the cliques.
[0165] In one embodiment of the present invention, the
computational core 100 is designed to indicate whether or not a
certain data stream has been identified. In such an embodiment, all
the cells in the array 112 are connected to an electronic circuit
113, such as an OR logic gate, which is designed to output a
Boolean value based upon all the values in the cells. In such an
embodiment, the output 114 may be a Boolean value that indicates to
a central computing unit that the computational core has identified
a certain data stream.
[0166] In another embodiment, the computational core is designed
not merely to indicate that identification has been made but to
indicate which data stream has been identified. In such an
embodiment, the electronic circuit 113 allows the transferring of a
Boolean vector. In such an embodiment, the clique itself and/or the
value represented by the clique can be transferred to a central
computing unit.
[0167] As described above, the computational core can operate in
learning and operational modes, melting and freezing. During the
learning mode, new inputs are transferred in parallel to all the
computational cores.
[0168] Reference in now made to FIG. 29A, which is a schematic
representation of a computational core according to the present
invention. The linker section 47 and the liquid section 46 are as
depicted in FIG. 2. In FIG. 29A, however, there are further
depicted the associations between members of an array of LT cliques
12 and different LTUs in the liquid section 46.
[0169] The linker section 47 comprises an array of LT cliques 12.
Each member of the array of LT cliques 12 is configured to be
matched with a certain clique signature within the response of the
liquid section 46. For example, in FIG. 29A the members of a
certain clique signature in the array of LT cliques 12 are colored
gray and are connected to the representation of the clique 12
within the linker section 47 with a dark line.
[0170] During the learning process, every identified signal or a
class of identified signals is associated with a different member
of the array of LT cliques 12. The associated member is used to
store a set of values representing the LTUs of the related LT
clique, wherein each one of the LTUs in the set is defined
according to the following equations:
LT.sub.i ∈ Clique(S.sub.j) if
Q.sub.i = P(LT.sub.i=1|S.sub.j) >> P(LT.sub.i=1)
where, for each LT.sub.i of the core, Q.sub.i is the probability of
response given a desired string, denoted S.sub.j. This probability is
calculated and compared with the probability of response, given any
other input. This is calculated by presenting a large number of
random inputs. The Clique is composed of those LT.sub.i for which
the probability of response given a desired
string/regular-expression is much higher than the probability to
respond to any other input. The Q.sub.i is calculated for each
LT.sub.i of the core and compared against a certain threshold
Q.sub.th. Thus, a reduced, selected population of LTs is defined as
clique by:
Clique = {LT.sub.i | Q.sub.i > Q.sub.th}.
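The clique-selection rule above can be sketched as an estimation procedure; the response-matrix representation and the simple "exceeds baseline" comparison standing in for the much-greater-than condition are assumptions:

```python
def build_clique(responses_to_target, responses_to_random, q_th):
    """Estimate, per LTU, Q_i = P(LT_i = 1 | S_j) from repeated
    presentations of the desired string S_j, estimate the baseline
    P(LT_i = 1) from a large number of random inputs, and keep
    Clique = {LT_i | Q_i > Q_th} among LTUs whose Q_i exceeds the
    baseline. Each argument is a list of 0/1 response vectors,
    one vector per presentation."""
    n_ltus = len(responses_to_target[0])
    clique = []
    for i in range(n_ltus):
        q_i = sum(r[i] for r in responses_to_target) / len(responses_to_target)
        baseline = sum(r[i] for r in responses_to_random) / len(responses_to_random)
        if q_i > q_th and q_i > baseline:
            clique.append(i)
    return clique
```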
[0171] FIG. 29A depicts a computational core 100, of the kind shown
in FIG. 9A, during the learning process. As depicted in FIG. 29B, a
number of LTUs 350 identify the received external data stream 250;
however, only some of them 351 have a higher probability of response
to the received external data stream, or to a derivative thereof,
than to any other identified input. The LTUs with the higher
probability are stored as a unique pattern, or signature,
representing the received external data stream 250, for example as
"class 1".
[0172] During the operational mode, the LT clique 351 is used to classify the
received external data stream 250. FIG. 29B shows a computational core 100, of
the kind depicted in FIG. 9B. In FIG. 29B an external data stream 250 is
received and analyzed by the computational core 100 during the operational
mode. As depicted, the received external data stream 250 is identified by a
group of LTUs 450 that comprises the previously identified LT clique 351,
whose members have a higher probability of response to the received external
data stream, or to a derivative thereof, than to any other identified input.
As the group of LTUs 450 that identifies the received external data stream 250
comprises the members of the LT clique 351, the computational core can
classify the received external data stream 250 according to the class which
has been assigned to it during the learning process 452.
[0174] In another embodiment, the learning may be implemented in the following way:
[0175] 1) Defining all the LTUs as reporting LTs.
[0176] 2) Injecting a novel signal or signals from a certain class of signals into each computational core.
[0177] 3) Checking the stability of the responses of each reporting LT to the injection.
[0178] 4) Extracting the reporting LTs which have a stability below a predefined threshold from the group of the reporting LTs.
[0179] 5) In such a manner, different reporting LTs are chosen for each one of the computational cores.
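The five steps above can be sketched as follows. The stability measure used here, each reporting LT's agreement with its own majority response across repeated injections, is an assumed metric, since the text leaves the measure unspecified:

```python
import numpy as np

def stable_reporting_lts(responses, stability_th):
    """Filter reporting LTs by response stability.

    responses: (n_injections, n_lt) 0/1 responses of every reporting LT
    (step 1) to repeated injections of signals from one class (step 2).
    Stability (step 3) is measured as each LT's agreement rate with its
    own majority response; LTs below stability_th are extracted from the
    group of reporting LTs (step 4).
    """
    majority = responses.mean(axis=0) >= 0.5
    stability = (responses == majority).mean(axis=0)
    return np.flatnonzero(stability >= stability_th)
```

Because each core has a differently structured liquid section, running this filter per core naturally selects different reporting LTs for each core (step 5).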
[0180] An example of such a clique selection for one computational
core is shown in the graph which is depicted in FIG. 30, in which
the y-axis is the probability Q.sub.i for a certain identified
input, such as a string, to be identified by a certain LT.sub.i and
the x-axis is the index of the LT.sub.i. Dot 18 exemplifies the
Q.sub.i for a particular LT.sub.i. Preferably, all values of
LT.sub.i where Q.sub.i is higher than the predefined Q.sub.th, as
shown at 16, are included in the LT clique, as shown at 17. It should be noted
that any other manner that allows the identification of LTUs that are suitable
to the introduced input might also be implemented. During the operational
mode, it is assumed that the array of LT cliques 12 is defined.
[0181] Reference is now made to FIG. 31, which is a graphical
representation of the computational layer 1, according to a
preferred embodiment of the present invention. As described above,
each one of the computational cores 131 is configured to identify a
number of signals or a class of signals. Each signal is reflected
by the output of the LTUs of the liquid section that belong to the
clique associated with a certain class of signals, as described
above. The computational layer 1 is as depicted in FIG. 27; FIG. 31, however,
further depicts an electronic circuit 141 for implementing a majority voting
algorithm. As described above, each one of the computational cores 131 is
designed to generate a core output, which reflects which signal has been
identified in the introduced external data stream. It should be noted that, as
a majority voting algorithm is used, the process is relatively fault tolerant.
If one of the computational cores that has been configured to identify the
signal fails to do so, the identification will still be carried out correctly,
as the majority of the computational cores will identify the signal. Clearly,
as the process is relatively fault tolerant, individual cores do not have to
be perfect, and the production yield is radically improved since imperfect
chips can still be used. Thus the production cost of the VLSI chip decreases.
[0182] Moreover, such an embodiment allows the processing of ambiguous and
noisy data, as the majority voting identification process radically improves
performance.
[0183] FIG. 31 depicts an integrated circuit 141 which is connected
to receive all the core outputs of all the computational cores 131.
The integrated circuit 141 is designed to receive all the core
outputs which are received in response to the reception of a
certain external data stream, for example, string S.sub.i (see
below). The integrated circuit 141 is designed to implement a
"majority voting" algorithm that is preferably defined according to
the following functions:
$$D_{NLA,S_j} = D_{i,S_j} \quad \text{such that} \quad f_i = \max_k \{f_k\}$$
$$f_k = \sum_{l \neq k} w_k\, \delta(D_{k,S_j}, D_{l,S_j})$$
wherein $\delta$ denotes a discrete metric function; $k$ and $l$ denote
indices of the grid of computational cores; $w_k$ is the weight of each core
in the voting process, which is preferably assumed to be equal to 1 in simple
realizations; and $f_k$ denotes the weighted "voting rate" of a subgroup of
the LT clique which is associated with a certain output, $D_{k,S_j}$, to input
$S_j$.
[0184] Thus the final output of computational layer 1, in response
to a certain external data stream which has been identified as
S.sub.j, is the output that is defined by the maximal voting rate
f.sub.i within the array of LT cliques.
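A minimal Python sketch of this voting step follows. Each core output $D_k$ is assumed to be a hashable value, and $\delta$ is taken here as an agreement indicator (1 when two outputs match, 0 otherwise), so that $f_k$ counts the weighted number of cores agreeing with core $k$; the function name and unit weights are illustrative, not from the specification:

```python
def majority_vote(core_outputs, weights=None):
    """Pick the core output with the maximal weighted voting rate f_k.

    f_k = w_k * sum over l != k of delta(D_k, D_l), where delta is 1
    when two core outputs agree and 0 otherwise. With unit weights this
    reduces to choosing the most frequent output.
    """
    if weights is None:
        weights = [1.0] * len(core_outputs)  # simple realization: w_k = 1
    f = [weights[k] * sum(1 for l, d_l in enumerate(core_outputs)
                          if l != k and d_l == d_k)
         for k, d_k in enumerate(core_outputs)]
    best = max(range(len(core_outputs)), key=f.__getitem__)
    return core_outputs[best], f[best]
```

A failed core simply contributes one wrong vote, which the maximum over $f_k$ ignores as long as most cores agree, which is the fault-tolerance property described above.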
[0185] The programming and adjusting process, including, for
example, characterization of arrays of LT cliques, setting the
parameters of a certain LT clique, and programming of any other
realization of a linker can be performed during the various phases.
Such programming can be done by software simulation prior to the
manufacturing process. In such a manner, the computational layer
can be fully or partially hard-coded with programmed tasks, such as
matching certain classes of strings or identifying certain subsets
of regular expressions or any other classification task.
Preferably, dynamic programming of the linker may be achieved by
adjusting the linker of the computational layer in a reconfigurable
manner. For example, in the described embodiment, an array of LT
cliques can be defined as a reserve and the parameters can be
determined by fuses or can be determined dynamically by any other
conventional VLSI technique. Preferably, the same LT cliques can be
reprogrammed to allow identification of different subsets of
strings and regular expressions.
[0186] The output of the computational layer may vary according to the
application that utilizes the computational layer. For content inspection, for
example to detect viruses, the output of the computational layer is binary: 0
for ignoring the injected input (letting it pass) and 1 for blocking the
injected input if a match was identified (meaning a suspicious string has been
identified), or vice versa.
[0187] Preferably, if the used application is related to
information retrieval or data processing, an index of the
identified string or regular expression is produced in addition to
the detection result.
[0188] Reference is now made to FIG. 33, which is a graphical
representation of a diagram of a computational layer 1, as depicted
in FIG. 11A, which further comprises a number of voting components
2008, input preprocessing components, and a signature selector
2006, according to one embodiment of the present invention. FIG. 33
depicts a de-multiplexer (demux) 2001 that encodes the information
which is received from a serial FIFO component 2000 and forwards it
to a preprocessing unit 2002 that preprocesses the received information and
generates an output based thereupon. The preprocessed outputs are forwarded to
an input buffer 2003, which is designed to allow the injection of the
preprocessed outputs into a number of network logic components 2004. Each
network logic component 2004 is defined as one of the aforementioned liquid
sections. The network logic components 2004, like the liquid sections above,
are designed to output a unique signature that represents a pattern that has
been identified in the received information. Each one of the network logic
components 2004 is separately connected, via a network buffer 2005, to a
linking component 2007. Each linking component 2007 is defined, like the
aforementioned linker section, to receive the outputs of a related network
logic component 2004 and to output a discrete value based thereupon. The
linking component 2007 comprises a number of records. Each record is defined,
during the learning mode, to be matched with a unique output of the network
logic components 2004. Each one of the linking components 2007 receives the
unique output from a related network logic component 2004 and matches it with
one of its records.
[0189] Preferably, a number of different discrete values are stored in each
one of the records. Each one of the different discrete values constitutes a
different signature which is associated with the unique output of the linking
components 2007. In the depicted embodiment, the linking component 2007
forwards each one of the different discrete values, which constitutes a
different signature, to one of a number of different designated voting
components 2008. Each voting component 2008 is designed to apply a voting
algorithm, as described above, to the received discrete values. Such an
embodiment can be extremely beneficial for processing signals that document
the voices of more than one speaker. Each one of the voting components 2008
may be designed to receive signatures which are assigned to indicate that a
pattern, associated with one of the speakers, has been identified by one or
more of the network logic components 2004. Such an embodiment can also be used
to perform several tasks in parallel on the same data stream. For example, the
same voice signal may be processed simultaneously to identify the speaker, the
language, and several keywords.
[0190] Reference is now made to FIG. 32, which is a flowchart of an
exemplary method for processing an external data stream using a
number of computational cores, such as the aforementioned
computational cores, according to a preferred embodiment of the
present invention. During the first step, as shown at 1400, an
external data stream is received. As described above, the external
data stream may originate from a sensor that captures signals from
the real world. The received external data stream may be encoded by
a designated computing device before it is processed. As described
above, in order to process the external data stream, a number of
computational cores are used in parallel. Therefore, during the
following step, as shown at 1401, the external data stream is
directly transferred to a number of different computational cores.
As described above in relation to the liquid section, each one of
the computational cores is associated with an assembly that has
been structured according to a unique pattern of processors. During
the following step, as shown at 1402, each one of the computational
cores uses the associated unique pattern of processors for
processing the external data stream. Then, as shown at 1403, the
outputs of all the processing devices are collected. Such a
collected output can be used for signal analysis, identification
and classification, as further described and explained above.
Preferably, two additional steps are added to the depicted process. After the
core outputs are collected, a voting algorithm, such as the majority voting
algorithm, is used to choose one of the core outputs. In the final step, the
chosen core output is forwarded to a certain application that utilizes the
information or presents it to a user.
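The flow of FIG. 32 can be summarized in a short Python sketch. The cores are modeled as plain callables and are run sequentially here for clarity, whereas the specification runs them in parallel; the voting reducer corresponds to the optional final steps:

```python
def process_stream(data_stream, cores, vote=None):
    """Receive an external data stream (1400), transfer it to every core
    (1401), let each core apply its unique processor pattern (1402),
    collect the core outputs (1403), and optionally reduce them with a
    voting step before forwarding the result to an application."""
    outputs = [core(data_stream) for core in cores]
    return vote(outputs) if vote is not None else outputs
```

For example, with toy cores and a most-frequent-output reducer, the returned value is the output shared by the majority of cores.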
[0191] It is expected that during the life of this patent many relevant
devices and systems will be developed, and the scope of the terms herein,
particularly of the terms computational core, computation, computing, data
stream, sensor, and signal, is intended to include all such new technologies a
priori.
[0192] It is appreciated that certain features of the invention,
which are, for clarity, described in the context of separate
embodiments, may also be provided in combination in a single
embodiment. Conversely, various features of the invention which
are, for brevity, described in the context of a single embodiment,
may also be provided separately or in any suitable
sub-combination.
[0193] Although the invention has been described in conjunction
with specific embodiments thereof, it is evident that many
alternatives, modifications and variations will be apparent to
those skilled in the art. Accordingly, it is intended to embrace
all such alternatives, modifications and variations that fall
within the spirit and broad scope of the appended claims. All
publications, patents, and patent applications mentioned in this
specification are herein incorporated in their entirety by
reference into the specification, to the same extent as if each
individual publication, patent or patent application was
specifically and individually indicated to be incorporated herein
by reference. In addition, citation or identification of any
reference in this application shall not be construed as an
admission that such reference is available as prior art to the
present invention.
* * * * *