U.S. patent application number 12/974689 was filed with the patent office on 2010-12-21 and published on 2012-06-21 for data path for data extraction from streaming data.
This patent application is currently assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. Invention is credited to Francois Abel, Jean L. Calvignac, Christoph Hagleitner, Jonathan B. Rohrer, Jan Van Lunteren, Fabrice J. Verplanken.
Application Number | 20120155492 12/974689 |
Family ID | 46234377 |
Filed Date | 2010-12-21 |
United States Patent Application | 20120155492 |
Kind Code | A1 |
Abel; Francois; et al. | June 21, 2012 |
Data Path for Data Extraction From Streaming Data
Abstract
A data path for streaming data includes a plurality of
sequential data registers, each of the plurality of sequential data
registers comprising a plurality of data fields, wherein the
streaming data moves sequentially through the sequential data
registers; and a multiplexing unit, the multiplexing unit
configured such that the multiplexing unit has access to each of
the plurality of data fields of the plurality of sequential data
registers, and wherein the multiplexing unit is configured to
extract data from the streaming data as the streaming data moves
through the sequential data registers in response to a data
request.
Inventors: | Abel; Francois (Sierentz, FR) ; Calvignac; Jean L. (Raleigh, NC) ; Hagleitner; Christoph (Zurich, CH) ; Rohrer; Jonathan B. (Zurich, CH) ; Van Lunteren; Jan (Gattikon, CH) ; Verplanken; Fabrice J. (La Gaude, FR) |
Assignee: | INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY |
Family ID: | 46234377 |
Appl. No.: | 12/974689 |
Filed: | December 21, 2010 |
Current U.S. Class: | 370/474 |
Current CPC Class: | H04J 3/1682 20130101 |
Class at Publication: | 370/474 |
International Class: | H04J 3/24 20060101 H04J003/24 |
Claims
1. A data path for streaming data, comprising: a plurality of
sequential data registers, each of the plurality of sequential data
registers comprising a plurality of data fields, wherein the
streaming data moves sequentially through the sequential data
registers; and a multiplexing unit, the multiplexing unit
configured such that the multiplexing unit has access to each of
the plurality of data fields of the plurality of sequential data
registers, and wherein the multiplexing unit is configured to
extract data from the streaming data as the streaming data moves
through the sequential data registers in response to a data
request.
2. The data path for streaming data of claim 1, wherein the
multiplexing unit comprises a plurality of multiplexers, each of
the plurality of multiplexers having access to each of the
plurality of data fields of the plurality of sequential data
registers.
3. The data path for streaming data of claim 1, further comprising
an admit logic configured to transfer the streaming data from a
medium access control (MAC) interface into a first data register of
the plurality of sequential data registers.
4. The data path for streaming data of claim 2, wherein the admit
logic is configured to transfer a portion of the streaming data
from a MAC interface into a first data register of the plurality of
sequential data registers in response to streaming data being
available from the MAC and the first data register being empty.
5. The data path for streaming data of claim 1, further comprising
a transfer logic, the transfer logic being associated with a
respective pair of sequential data registers of the plurality of
sequential data registers, the transfer logic being configured to
transfer the streaming data from a first data register of the
respective pair of sequential data registers to a second data
register of the respective pair of sequential data registers.
6. The data path for streaming data of claim 4, wherein the
transfer logic is configured to transfer a portion of the streaming
data from the first data register of the respective pair of data
registers to the second data register of the respective pair of
sequential data registers in response to the first data register of
the respective pair of sequential data registers being full and the
second data register of the respective pair of sequential data
registers being empty.
7. The data path for streaming data of claim 4, further comprising
a plurality of transfer logics, each transfer logic being
associated with a respective pair of sequential data registers of
the plurality of sequential data registers, and wherein a number of
the plurality of transfer logics is equal to a number of the
plurality of sequential data registers minus one.
8. The data path for streaming data of claim 1, further comprising
an eject logic configured to eject the streaming data from a last
data register of the plurality of sequential data registers to a
received packet processor (RPP) module.
9. The data path for streaming data of claim 1, wherein the eject
logic is configured to eject a portion of the streaming data in
response to the last data register of the plurality of sequential
data registers being full, the RPP module being empty, and an index
of the streaming data in the last data register being less than an
index of a lowest-indexed pending data request.
10. The data path for streaming data of claim 1, wherein the data
path is configured to: receive a plurality of data requests, each
of the plurality of data requests having a respective index in the
streaming data, from a stream processor; extract data from the
streaming data in the plurality of sequential data registers via
the multiplexing unit in response to the plurality of data requests
based on the respective indices; and send the extracted data to the
stream processor.
11. A method for extracting data from streaming data in a data
path, comprising: moving the streaming data sequentially through a
plurality of sequential data registers of the data path, each of
the plurality of sequential data registers comprising a plurality
of data fields; and extracting data from the streaming data as the
streaming data moves through the sequential data registers by a
multiplexing unit in response to a data request, wherein the
multiplexing unit is configured such that the multiplexing unit has
access to each of the plurality of data fields of the plurality of
sequential data registers.
12. The method of claim 11, wherein moving the streaming data
sequentially through the plurality of sequential data registers
comprises transferring a portion of the streaming data from a
medium access control (MAC) interface into a first data register of
the plurality of sequential data registers in response to streaming
data being available from the MAC and the first data register being
empty.
13. The method of claim 11, wherein moving the streaming data
sequentially through the plurality of sequential data registers
further comprises transferring a portion of the streaming data from
a first data register of a pair of sequential data registers of the
plurality of sequential data registers to a second data register of
the pair of sequential data registers in response to the first data
register of the pair of sequential data registers being full and
the second data register of the pair of sequential data registers
being empty.
14. The method of claim 11, wherein moving the streaming data
sequentially through the plurality of sequential data registers
further comprises ejecting a portion of the streaming data from a
last data register of the plurality of sequential data registers to
a received packet processor (RPP) module in response to the last
data register of the plurality of sequential data registers being
full, the RPP module being empty, and an index of the streaming
data in the last data register being less than an index of a
lowest-indexed pending data request.
15. The method of claim 11, wherein extracting data from the
streaming data as the streaming data moves through the sequential
data registers by a multiplexing unit in response to a data request
comprises: receiving a plurality of data requests, each of the
plurality of data requests having a respective index in the
streaming data, from a stream processor; extracting data from the
streaming data in the plurality of sequential data registers via
the multiplexing unit in response to the plurality of data requests
based on the respective indices; and sending the extracted data to
the stream processor.
16. A computer program product comprising a computer readable
storage medium containing computer code that, when executed by a
computer, implements a method for extracting data from streaming
data in a data path, wherein the method comprises: moving the
streaming data sequentially through a plurality of sequential data
registers of the data path, each of the plurality of sequential
data registers comprising a plurality of data fields; and
extracting data from the streaming data as the streaming data moves
through the sequential data registers by a multiplexing unit in
response to a data request, wherein the multiplexing unit is
configured such that the multiplexing unit has access to each of
the plurality of data fields of the plurality of sequential data
registers.
17. The computer program product according to claim 16, wherein
moving the streaming data sequentially through the plurality of
sequential data registers comprises transferring a portion of the
streaming data from a medium access control (MAC) interface into a
first data register of the plurality of sequential data registers
in response to streaming data being available from the MAC and the
first data register being empty.
18. The computer program product according to claim 16, wherein
moving the streaming data sequentially through the plurality of
sequential data registers further comprises transferring a portion
of the streaming data from a first data register of a pair of
sequential data registers of the plurality of sequential data
registers to a second data register of the pair of sequential data
registers in response to the first data register of the pair of
sequential data registers being full and the second data register
of the pair of sequential data registers being empty.
19. The computer program product according to claim 16, wherein
moving the streaming data sequentially through the plurality of
sequential data registers further comprises ejecting a portion of
the streaming data from a last data register of the plurality of
sequential data registers to a received packet processor (RPP)
module in response to the last data register of the plurality of
sequential data registers being full, the RPP module being empty,
and an index of the streaming data in the last data register being
less than an index of a lowest-indexed pending data request.
20. The computer program product according to claim 16, wherein
extracting data from the streaming data as the streaming data moves
through the sequential data registers by a multiplexing unit in
response to a data request comprises: receiving a plurality of data
requests, each of the plurality of data requests having a
respective index in the streaming data, from a stream processor;
extracting data from the streaming data in the plurality of
sequential data registers via the multiplexing unit in response to
the plurality of data requests based on the respective indices; and
sending the extracted data to the stream processor.
Description
FIELD
[0001] This disclosure relates generally to the field of the
inspection and processing of a stream of data received over a
communication channel.
DESCRIPTION OF RELATED ART
[0002] Data transferred over a communication channel may be
provided in a streaming mode, which is a serial sequence of
information. An advantage of streaming data is that the receiver
may start processing the content of the data stream before the
entire stream is received. Streaming data may include various types
of content, such as video, audio, or multimedia content. Streaming
enables the data to be played back as soon as it is received, thus
reducing the delay at the receiver before the media can be
presented to the user.
[0003] Streaming data is supported by underlying networks and
protocols that are used to transport the data stream. Every
datagram, cell, packet, and frame transferred over a network is
formatted as a stream of bits. Sequential ordering of information
is inherent to protocols used to transfer streaming data over a
network, as sequential ordering of streamed information reduces the
amount of processing required at the receiver. The beginning of a
stream may contain one or multiple fields of information regarding
the rest of the stream. For example, the first digit of an internet
protocol (IP) datagram may indicate the IP version of the
stream.
[0004] Sequential ordering enables flexible protocol stack
combinations, which allow different sets of computers running
different high-level network protocols to share the same physical
media. For example, the stacking of transmission control protocol
(TCP) over IP protocol may be indicated by the protocol field of
the IP header datagram, which is transferred ahead of the payload
part of the IP datagram used to embed the TCP datagram. TCP over IP
is also referred to as protocol encapsulation because it turns a
given network layer into a trucking service that is unaware of the
data it carries for the upper layer applications, enabling
protocols to be deployed with flexible options. For example, the
next header field of an IPv6 datagram may encode a higher layer
protocol such as TCP or user datagram protocol (UDP), or may
indicate that the next datagram is one of the IPv6 extension
headers and that it is carrying some additional options related to
current IP protocol layer.
[0005] The receiver may need to extract one or more particular
fields, such as header information, from a stream of data as it is
received in order to properly process the rest of the data stream
according to the correct protocol. This may be achieved by reading
the streaming data into a temporary physical buffer, and advancing
a stream pointer through the buffer to extract the needed data from
its position in the stream. However, advancing a stream pointer
through a buffer may be a relatively slow process. Also, because of
protocol stacking and protocol encapsulation, the exact numbers of
bytes that make up a header stack may not be known before the
stream parsing is done. Therefore, an arbitrary number of bytes
must be allocated to the header buffer in order to account for both
short packets (e.g., a 42-byte DIX/IPv4/UDP packet) as well as long
packets (e.g., a 206-byte
DIX/IPv6+Destination+Routing+Fragment-extension-headers/TCP packet).
Although main memory may be cheap and abundant in today's
computers, it is a scarce resource for a system on a chip (SoC) and
similar VLSI devices, in which multiple functions share a limited
amount of area and power on the silicon. Because of memory
constraints, hardware packet parsers and protocol processors may
only process a limited amount of an incoming stream by dedicating a
finite number of buffers to the stream data. In general, only the
first few tens or hundreds of bytes can be processed, which may
correspond to the minimum size required to hold the protocol stack
headers of a frame of some formats of streaming data. However,
other formats, such as InfiniBand (IB) or Ethernet, operate with
maximum transfer unit (MTU) sizes of multiple kilobytes.
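As a rough check on the short-packet figure above, the following sketch adds up the minimum header lengths of a DIX/IPv4/UDP stack. The header lengths are the standard minimum values and are not stated in this description; the dictionary and function names are illustrative only. The 206-byte figure depends on extension header lengths that the description does not list, so it is not reproduced here.

```python
# Minimum header lengths in bytes. These are standard protocol values,
# assumed here for illustration; they are not given in the description.
HEADER_BYTES = {
    "DIX": 14,   # Ethernet II: 6 B destination + 6 B source + 2 B EtherType
    "IPv4": 20,  # IPv4 header without options
    "UDP": 8,    # fixed-size UDP header
}


def header_stack_bytes(stack):
    """Sum the header lengths of an encapsulation stack, outermost first."""
    return sum(HEADER_BYTES[name] for name in stack)


short_stack = header_stack_bytes(["DIX", "IPv4", "UDP"])
print(short_stack)  # 42
```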
[0006] The processing of variable length fields must also be
supported. For example, an IPv6 header is composed of field sizes
that span from 4 bits (e.g., the IP version) up to 128 bits (e.g.,
the IP source and destination addresses). Although large fields
used to be segmented into smaller fields of fixed size, the
solution is not practical at multi-gigabit/s data rates because of
the dependency between the data rate and the stream processing
frequency (processing frequency = data rate / field size). The smaller
the field size, the higher the processing frequency needs to be.
Therefore, processing a 10 Gb/s Ethernet stream on the basis of
fixed field sizes of 8 bits has to be performed at a frequency of
1.25 GHz, which is relatively hard to achieve given the number of
fields that must be evaluated and processed in every clock cycle.
One solution for relaxing the processing frequency is to widen the
fixed field from 8 to 16, 32, or 64 bits, and to provide a
filtering mask mechanism for extracting shorter fields such as
the Ethernet type/len (16 bits) or the IP protocol (8 bits). For
example, the IP version (4 bits) can be extracted from a fixed
field of 16 bits by isolating the appropriate digit with a mask
such as 0xF000, 0x0F00, 0x00F0 or 0x000F. However, since most of
the network and media protocols are digit- and/or byte-based, the
management and the generation of such masks is a relatively complex
process.
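The relationship between field size and processing frequency, and the mask-based extraction of a single digit, can be illustrated with a short sketch. The function names are hypothetical, and the sample value 0x4500 is an assumed first 16 bits of an IPv4 header (version 4, header length 5); neither comes from the description.

```python
def processing_frequency_hz(data_rate_bps, field_size_bits):
    """processing frequency = data rate / field size."""
    return data_rate_bps / field_size_bits


# Widening the fixed field relaxes the required frequency for a 10 Gb/s stream.
for width in (8, 16, 32, 64):
    ghz = processing_frequency_hz(10e9, width) / 1e9
    print(f"{width:2d}-bit field -> {ghz} GHz")


def extract_digit(field16, mask):
    """Isolate one 4-bit digit of a 16-bit field using one of the four masks."""
    shift = {0xF000: 12, 0x0F00: 8, 0x00F0: 4, 0x000F: 0}[mask]
    return (field16 & mask) >> shift


first_field = 0x4500  # assumed sample: IPv4 version 4, header length 5, TOS 0
version = extract_digit(first_field, 0xF000)
header_len = extract_digit(first_field, 0x0F00)
```

Even in this small form, each mask needs its own shift amount, which hints at why generating and managing such masks for many fields becomes cumbersome.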
[0007] A stream processing application may be difficult to
accelerate by means of parallelization techniques because of
intrinsic sequential data representation. For the case of network
protocol processing, this means that the beginning of a network
frame typically contains one or multiple fields indicating what the
rest of the frame is about. For example, the processing of the 5th
and 6th bytes of an IP datagram cannot start before the IP version
has been identified by processing the first digit of the datagram:
bytes 5-6 of IPv4 encode the "IPv4 Fragment Identification", while
bytes 5-6 of IPv6 encode the "IPv6 Payload Length".
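The dependency described above can be sketched as follows: the meaning of bytes 5-6 is only known once the first digit has been read. The function name and the sample byte strings are illustrative assumptions.

```python
def interpret_bytes_5_6(datagram: bytes):
    """Return (meaning, value) for bytes 5-6 of an IP datagram.

    The first digit (upper 4 bits of the first byte) must be read first,
    because it decides how the later bytes are interpreted.
    """
    version = datagram[0] >> 4
    value = int.from_bytes(datagram[4:6], "big")  # bytes 5-6, counting from 1
    if version == 4:
        return ("IPv4 Fragment Identification", value)
    if version == 6:
        return ("IPv6 Payload Length", value)
    raise ValueError(f"unsupported IP version {version}")


# Assumed sample prefixes of an IPv4 and an IPv6 header.
ipv4_prefix = bytes([0x45, 0x00, 0x00, 0x1C, 0x12, 0x34, 0x00, 0x00])
ipv6_prefix = bytes([0x60, 0x00, 0x00, 0x00, 0x00, 0x14, 0x11, 0x40])
print(interpret_bytes_5_6(ipv4_prefix))
print(interpret_bytes_5_6(ipv6_prefix))
```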
SUMMARY
[0008] In one aspect, a data path for streaming data includes a
plurality of sequential data registers, each of the plurality of
sequential data registers comprising a plurality of data fields,
wherein the streaming data moves sequentially through the
sequential data registers; and a multiplexing unit, the
multiplexing unit configured such that the multiplexing unit has
access to each of the plurality of data fields of the plurality of
sequential data registers, and wherein the multiplexing unit is
configured to extract data from the streaming data as the streaming
data moves through the sequential data registers in response to a
data request.
[0009] In another aspect, a method for extracting data from
streaming data in a data path includes moving the streaming data
sequentially through a plurality of sequential data registers of the
data path, each of the plurality of sequential data registers
comprising a plurality of data fields; and extracting data from the
streaming data as the streaming data moves through the sequential
data registers by a multiplexing unit in response to a data
request, wherein the multiplexing unit is configured such that the
multiplexing unit has access to each of the plurality of data
fields of the plurality of sequential data registers.
[0010] In another aspect, a computer program product including a
computer readable storage medium containing computer code that,
when executed by a computer, implements a method for extracting
data from streaming data in a data path, wherein the method
includes moving the streaming data sequentially through a plurality
of sequential data registers of the data path, each of the
plurality of sequential data registers comprising a plurality of
data fields; and extracting data from the streaming data as the
streaming data moves through the sequential data registers by a
multiplexing unit in response to a data request, wherein the
multiplexing unit is configured such that the multiplexing unit has
access to each of the plurality of data fields of the plurality of
sequential data registers.
[0011] Additional features are realized through the techniques of
the present exemplary embodiment. Other embodiments are described
in detail herein and are considered a part of what is claimed. For
a better understanding of the features of the exemplary embodiment,
refer to the description and to the drawings.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
[0012] Referring now to the drawings wherein like elements are
numbered alike in the several FIGURES:
[0013] FIG. 1 illustrates an embodiment of a data streaming system
including a data path for data extraction from streaming data.
[0014] FIG. 2 illustrates an embodiment of a data path for data
extraction from streaming data.
[0015] FIG. 3 illustrates another embodiment of a data path for
data extraction from streaming data.
[0016] FIG. 4 is a schematic block diagram illustrating an
embodiment of a method for an admit logic for a data path for data
extraction from streaming data.
[0017] FIG. 5 is a schematic block diagram illustrating an
embodiment of a method for a transfer logic for a data path for
data extraction from streaming data.
[0018] FIG. 6 is a schematic block diagram illustrating an
embodiment of a method for an eject logic for a data path for data
extraction from streaming data.
[0019] FIG. 7 is a schematic block diagram illustrating an
embodiment of a method for data extraction for a data path for data
extraction from streaming data.
[0020] FIG. 8 is a schematic block diagram illustrating an
embodiment of operation of a data path for data extraction from
streaming data.
[0021] FIG. 9 is a schematic block diagram illustrating an
embodiment of a computing system that may be used in conjunction
with a data path for data extraction from streaming data.
DETAILED DESCRIPTION
[0022] Embodiments of a data path for data extraction from
streaming data, and methods of operating a data path for data
extraction from streaming data, are provided, with exemplary
embodiments being discussed below in detail. Acceleration of
processing of sequential streaming data may be achieved by enabling
multiple variable length fields in the streaming data to be
extracted from any position in the stream at the same time. The
data path includes a pipeline made up of a plurality of data
registers, combined with one or more multiplexers. The pipeline
acts as a window sliding over the incoming data stream, while the
multiplexers may flexibly extract any required data from the
streaming data as it moves through the pipeline, thereby exposing
the streaming data to a stream processor.
[0023] Embodiments of a data path for data extraction from
streaming data may be implemented in hardware with a relatively
small footprint, thus minimizing the logic area utilization and
associated power consumption. Support for cut-through processing
may be provided, minimizing the processing latency. Only the
portion of the data stream that is currently being processed needs
to be buffered. Latency is further reduced by simultaneously
extracting data from a portion of a data stream while loading
additional data from the stream into the pipeline.
[0024] FIG. 1 illustrates an embodiment of a data streaming system
100 including a data path 103 for data extraction from streaming
data. Data streaming system 100 receives a data stream 101 from a
network at medium access control (MAC) 102. Data stream 101 may
include any type of streamed data, including but not limited to
video or audio. The data stream 101 is transferred from MAC 102 to
data path 103. Stream processor 104 requests specific data located
in data stream 101 from data path 103 (including but not limited to
data indicating the protocol of data stream 101), and data path 103
extracts the requested data from data stream 101 as described in
further detail below. The data stream 101 then proceeds to received
packet processing (RPP) module 105, which may perform such
functions as playback in embodiments in which data stream 101
includes audio or video data.
[0025] FIG. 2 illustrates an embodiment of a data path 200, which
may comprise data path 103 of FIG. 1. Input 201 receives a data
stream 202 from a MAC, such as MAC 102 of FIG. 1. The received data
stream 202 proceeds sequentially through a pipeline that includes
registers 203-204 en route to output 205, which sends the data
stream 202 to an RPP, such as RPP 105 of FIG. 1. Registers 203 and
204 include data fields 203A-D and 204A-D, respectively. Data from
data stream 202 first populates data fields 203A-D, then the data
in data fields 203A-D moves to data fields 204A-D in register 204.
Data path control 206 controls the progress of data stream 202 from
input 201 through registers 203-204 to output 205; the functions of
data path control 206 are discussed below in further detail with
respect to FIGS. 3-6. Registers 203-204 are shown for illustrative
purposes only; a data path 200 may include any appropriate number
of registers, the registers may include any appropriate number of
fields, and the fields may be any appropriate size, such as a byte
or multiple bytes, in various embodiments. The width of the data
contained in registers 203-204 may be any number of bytes (e.g., 8
B, 16 B, 32 B, 64 B), though in some embodiments, the register
width may be aligned with the width of the underlying physical
interface (for example, 16 B for a 10 GbE MAC). The depth of the
pipeline (i.e., the number of registers) may be any appropriate
number, though a minimum of two registers may hide any latency
incurred by the loading and the flushing of the data in the
pipeline in some embodiments. For embodiments including network
protocol processing in which relevant fields are contiguous to each
other in the stream, a 32 byte pipeline (2 registers of 16 bytes
each) is sufficient to store most typically used packet
headers.
[0026] As data stream 202 passes through the pipeline comprising
registers 203 and 204, every byte of the data stream 202 is exposed
on the fly to multiplexing unit 207, as the multiplexing unit 207
is connected to each data field in the pipeline. Therefore, any
data in the data stream 202 may be inspected and extracted
regardless of its position in the stream 202 and the length of the
stream 202 as the stream 202 passes through registers 203-204.
Multiplexing unit 207 may include one or multiple multiplexers, and
each multiplexer of multiplexing unit 207 may have access to every
field of every register in the pipeline. Each multiplexer may
extract one unit of data from the registers 203-204 per unit of
time (for example, per clock cycle). Therefore, each additional
multiplexer included in multiplexing unit 207 in data path 200
allows extraction of an additional unit of data per unit of time,
resulting in faster processing of data stream 202, and also for
extraction of variable amounts of data from data stream 202.
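A minimal software sketch of this arrangement is given below, assuming two registers of four one-byte fields each and a per-register base index that records the stream index of the register's first field. The base-index bookkeeping, the function names, and the sample data are assumptions made for illustration; the description itself does not prescribe them.

```python
FIELDS_PER_REGISTER = 4  # assumed: four one-byte fields per register


def mux_select(registers, reg, field):
    """One multiplexer: pick a single field out of every field in the pipeline."""
    return registers[reg][field]


def locate(stream_index, base_indices):
    """Map a stream index to (register, field), or None if outside the window.

    base_indices[r] is the stream index of field 0 of register r,
    or None if register r is empty.
    """
    for reg, base in enumerate(base_indices):
        if base is not None and base <= stream_index < base + FIELDS_PER_REGISTER:
            return reg, stream_index - base
    return None


def extract(registers, base_indices, wanted, num_muxes):
    """Serve at most num_muxes requested stream indices in one unit of time."""
    out = {}
    for idx in wanted[:num_muxes]:
        pos = locate(idx, base_indices)
        if pos is not None:
            out[idx] = mux_select(registers, *pos)
    return out


stream = bytes(range(0x10, 0x18))                   # stream indices 0..7
registers = [list(stream[4:8]), list(stream[0:4])]  # first register holds newer data
base_indices = [4, 0]
result = extract(registers, base_indices, wanted=[1, 6, 3], num_muxes=2)
```

With two multiplexers, only the first two of the three requested indices are served in this step; a third multiplexer would serve all three at once, which is the throughput gain the paragraph above describes.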
[0027] Data request module 208 receives requests for data from a
stream processor, such as stream processor 104 of FIG. 1, fulfills
the received requests by extracting data from data stream 202 via
multiplexing unit 207, and sends the extracted data back to the
stream processor. The data request module 208 may queue the
received data requests until they can be fulfilled via the
multiplexing unit 207. The stream processor 104 may issue one
request per needed unit of data, or a single request may be for
multiple units of data. The unit of data may be the same as the
size of a data field (such as data fields 203A-D or 204A-D) in some
embodiments. A request may include an index indicating the position
of the requested data within the data stream. Alternatively,
multiple requests may be encoded by means of a single pointer
combined with a set of relative offsets (e.g., a 16-bit packet
pointer plus three relative offsets of 5 bits) in order to request
data located at multiple indices in the data stream. The data
request module 208 replies to every request from the stream
processor 104 with the requested data from the stream, which is
extracted from the stream via the multiplexing unit 207. Data
requests for 2 to 3 bytes are sufficient to parse various protocol
stack combinations that are used over Ethernet, such as ISL, DIX,
SAP/SNAP, VLAN, MPLS, PPPoE, IPv4, IPv6, TCP, or UDP. The data
request module 208 is discussed in further detail below with
respect to FIG. 7.
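One possible encoding of the pointer-plus-offsets request mentioned above is sketched here. The bit layout (pointer in the upper bits, followed by the three offsets), the reading that each offset is 5 bits wide and is measured from the pointer, and all names are assumptions; the description only gives the field widths as an example.

```python
POINTER_BITS = 16  # packet pointer width from the example above
OFFSET_BITS = 5    # assumed: each of the three offsets is 5 bits wide
NUM_OFFSETS = 3


def encode_request(pointer, offsets):
    """Pack a pointer and three offsets into one word (assumed layout)."""
    if not 0 <= pointer < (1 << POINTER_BITS):
        raise ValueError("pointer out of range")
    if len(offsets) != NUM_OFFSETS:
        raise ValueError("expected exactly three offsets")
    word = pointer
    for off in offsets:
        if not 0 <= off < (1 << OFFSET_BITS):
            raise ValueError("offset out of range")
        word = (word << OFFSET_BITS) | off
    return word


def decode_request(word):
    """Recover the absolute stream indices addressed by an encoded request."""
    offsets = []
    for _ in range(NUM_OFFSETS):
        offsets.append(word & ((1 << OFFSET_BITS) - 1))
        word >>= OFFSET_BITS
    offsets.reverse()
    pointer = word
    # Assumption: every offset is relative to the pointer, not to the previous offset.
    return [pointer + off for off in offsets]


# Hypothetical request: pointer at stream index 14, offsets 0, 9 and 12.
word = encode_request(14, [0, 9, 12])
indices = decode_request(word)
print(indices)  # [14, 23, 26]
```

Under these assumptions a single 31-bit word addresses three locations in the stream, instead of three separate full-width indices.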
[0028] FIG. 3 illustrates another embodiment of a data path 300,
such as data path 103 of FIG. 1. Data path 300 includes three
registers 301-303, located between a data path input 309, which may
communicate with a MAC such as MAC 102 of FIG. 1, and a data path
output 310, which may communicate with an RPP such as RPP 105 of
FIG. 1. Each of registers 301-303 includes respective register data
(301A, 302A, 303A), which may be divided up into data fields of any
appropriate size; a status indicator (301B, 302B, 303B); and an
index counter (IC) (301C, 302C, 303C). The status indicators 301B,
302B, and 303B indicate whether the status indicator's respective
register data (301A, 302A, or 303A) is empty or full. The ICs 301C,
302C, and 303C each hold the highest index in the data stream of
the data in the IC's respective register data (301A, 302A, or
303A). The admit logic 304 is located between the first register
301 in the data path 300 and the data path input 309, and controls
admission of data from data path input 309 into register 301. Admit
logic 304 receives status data from the MAC (such as MAC 102 of
FIG. 1) at MAC status input 307 indicating whether the MAC has
received streaming data for the data path 300. Transfer logic 305A
is located between register 301 and register 302, and controls
movement of data from register 301 to register 302. Transfer logic
305B is located between register 302 and register 303, and controls
movement of data from register 302 to register 303. Eject logic 306
is located between the last register 303 in data path 300 and data
path output 310, and controls ejection of data from the data path
from register 303 to data path output 310. Eject logic 306 receives
status data from the RPP (such as RPP 105) at RPP status input 308
indicating whether the RPP has space to receive streaming data from
the data path 300. Admit logic 304, transfer logic 305A-B, and
eject logic 306 may together comprise a data path control module
such as data path control module 206 of FIG. 2, and are discussed
in further detail below with respect to FIGS. 4-6. Data request
module 312 receives requests for data from a stream processor such
as stream processor 104 of FIG. 1, and fulfils these requests via
multiplexer unit 311; this process is discussed in further detail
above with respect to multiplexer unit 207 and data request module
208 of FIG. 2, and below with respect to FIG. 7.
[0029] FIG. 4 illustrates an embodiment of a method 400 for an
admit logic, such as admit logic 304 of FIG. 3. In block 401, it is
determined if streaming data is available from the MAC based on MAC
status input 307. In block 402, it is determined if the register
data 301A of the first register 301 in the data path 300 is empty
based on status indicator 301B. If data is available from the MAC,
and status indicator 301B is equal to empty, data is transferred
from the MAC into the register data 301A of register 301 in block
403. Then, in block 404, the status indicator 301B of register 301
is updated to full, and the index counter 301C of register 301 is
updated to hold the highest index of the data in register data
301A. A data path includes a single admit logic implementing method
400 located between the first register of the data path and the
MAC. The admit logic may be implemented in software, hardware, or a
combination of software and hardware in various embodiments.
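The admit decision of blocks 401-404 can be modeled in software roughly as follows. The `Register` class, the `next_index` bookkeeping, and the convention that `None` stands for "no data available from the MAC" are illustrative assumptions, not elements of the described embodiment.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Register:
    data: List[int] = field(default_factory=list)  # register data
    full: bool = False                             # status indicator (empty/full)
    index_counter: int = -1                        # highest stream index held


def admit(mac_chunk: Optional[List[int]], first: Register, next_index: int) -> int:
    """Admit one chunk into the first register if allowed.

    mac_chunk is None when the MAC reports no data (assumed convention).
    Returns the stream index of the next byte still to be admitted.
    """
    if mac_chunk is not None and not first.full:   # blocks 401 and 402
        first.data = list(mac_chunk)               # block 403
        first.full = True                          # block 404: status -> full
        first.index_counter = next_index + len(mac_chunk) - 1
        return next_index + len(mac_chunk)
    return next_index


reg = Register()
nxt = admit([0x45, 0x00, 0x00, 0x1C], reg, 0)   # admitted: register was empty
nxt_again = admit([0x12, 0x34, 0x00, 0x00], reg, nxt)  # refused: register is full
```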
[0030] FIG. 5 illustrates an embodiment of a method 500 for a
transfer logic, such as transfer logic 305A or 305B of FIG. 3.
FIG. 5 is discussed with respect to transfer logic 305A. First, in
block 501, it is determined if the register data 301A of register
301 is full based on status indicator 301B. Then, in block 502, it
is determined if the register data 302A of register 302 is empty
based on status indicator 302B. If status indicator 301B is equal
to full and status indicator 302B is equal to empty, the data
located in register data 301A is transferred to register data 302A
in block 503. Then, in block 504, the status indicator 301B of
register 301 is updated to empty, the status indicator 302B of
register 302 is updated to full, and the index counter 302C of
register 302 is updated to hold the highest index of the data in
register data 302A. Transfer logic 305B similarly transfers data
from register 302 to register 303 using method 500. In embodiments
of a data path that includes N registers, a transfer logic
implementing method 500 is located in between each set of two
adjacent registers; therefore, the total number of transfer logic
modules for the given data path is equal to N-1. A transfer
logic may be implemented in software, hardware, or a combination of
software and hardware in various embodiments.
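As a rough software sketch of blocks 501-504, assuming a hypothetical `Register` class and `transfer` function that are not part of the application, the handshake between two adjacent registers might look as follows. Pairing each register with its successor yields the N-1 transfer logic modules mentioned above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Register:
    """Hypothetical model of one register in the data path."""
    data: List[int] = field(default_factory=list)  # register data
    full: bool = False                              # status indicator
    index_counter: int = 0                          # index counter (IC)


def transfer(src: Register, dst: Register) -> bool:
    """Run blocks 501-504 once; return True if data moved from src to dst."""
    if not src.full:          # block 501: upstream register must be full
        return False
    if dst.full:              # block 502: downstream register must be empty
        return False
    dst.data = src.data       # block 503: move the register data
    src.data = []
    src.full = False          # block 504: update both status indicators
    dst.full = True
    dst.index_counter = src.index_counter  # highest index now held by dst
    return True


# A data path with N registers has one transfer logic per adjacent pair.
registers = [Register() for _ in range(3)]
transfer_pairs = list(zip(registers, registers[1:]))  # N-1 = 2 pairs

registers[0].data = list(range(1, 17))
registers[0].full = True
registers[0].index_counter = 16
moved = transfer(*transfer_pairs[0])
```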
[0031] FIG. 6 illustrates an embodiment of a method 600 for an eject logic, such
as eject logic 306 of FIG. 3. In block 601, it is determined if the
register data 303A of the last register 303 in the data path 300 is
full based on status indicator 303B. In block 602, it is determined
if the RPP is empty based on RPP status input 308. In block 603, it
is determined if the IC 303C is less than the index of the
lowest-indexed pending data request in data request module 312
based on input from data request module 312. The determination of
block 603 prevents stream data needed by the stream processor from
being ejected from the data path 300 before it is extracted. If the
status indicator 303B is equal to full, the RPP is empty, and
the IC 303C is less than the index of the lowest-indexed pending
data request, then, in block 604, the data located in register 303
is transferred to the RPP at output 310. In block 605, the status
indicator 303B is updated to empty. A data path includes a single
eject logic implementing method 600 located between the last
register of the data path and the RPP. The eject logic may be
implemented in software, hardware, or a combination of software and
hardware in various embodiments.
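The checks of blocks 601-605 can be sketched in software as shown below. This is an illustrative sketch only: `Register`, `eject`, `rpp_empty`, and `pending_indices` are hypothetical names, and the sketch assumes that ejection is permitted when no data requests are pending, a case the description above does not address explicitly.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class Register:
    """Hypothetical model of the last register (e.g., register 303)."""
    data: List[int] = field(default_factory=list)  # register data (e.g., 303A)
    full: bool = False                              # status indicator (e.g., 303B)
    index_counter: int = 0                          # index counter (e.g., IC 303C)


def eject(last: Register, rpp_empty: bool,
          pending_indices: Sequence[int]) -> Optional[List[int]]:
    """Run blocks 601-605 once; return the ejected data, or None if held."""
    if not last.full:                 # block 601: nothing to eject
        return None
    if not rpp_empty:                 # block 602: the RPP cannot accept data
        return None
    # Block 603: hold the data unless the IC is less than the index of the
    # lowest-indexed pending request (assumption: no pending requests means
    # the data may be ejected).
    if pending_indices and not last.index_counter < min(pending_indices):
        return None
    ejected = last.data               # block 604: transfer the data to the RPP
    last.data = []
    last.full = False                 # block 605: status indicator set to empty
    return ejected


reg = Register(data=list(range(1, 17)), full=True, index_counter=16)
held = eject(reg, True, [16])         # request for index 16 is still pending
sent = eject(reg, True, [17, 40])     # only later data is requested
```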
[0032] FIG. 7 illustrates an embodiment of a data extraction method
700 that may be implemented in a data request module, such as data
request module 208 or 312. FIG. 7 is discussed with respect to FIG.
3. In block 701, the data request module 312 receives one or more
data requests from a stream processor (such as stream processor 104
of FIG. 1). It is then determined in block 702 whether the
requested data is present in any of the registers 301-303 based on
the index counters 301C, 302C, and 303C, and the status indicators
301B, 302B, and 303B (as data is only present in a register that is
full). If the requested data is found in a register, it is then
extracted from the register by a multiplexer of multiplexing unit
311 in block 703. In block 704, the extracted data is transferred
to the stream processor by the data request module 312.
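A minimal sketch of the lookup performed in blocks 702-703 is given below. It assumes, purely for illustration, that each full register holds a contiguous run of stream indices ending at its index counter; the names `Register` and `extract` are hypothetical and not taken from the application.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class Register:
    """Hypothetical model of one register in the data path."""
    data: List[str] = field(default_factory=list)  # register data
    full: bool = False                              # status indicator
    index_counter: int = 0                          # highest stream index held


def extract(registers: Sequence[Register],
            requested_index: int) -> Optional[str]:
    """Return the requested data unit, or None if it is not in the data path."""
    for reg in registers:
        if not reg.full:              # data is only present in a full register
            continue
        # Assumption: indices in a register are contiguous and end at the IC.
        lowest = reg.index_counter - len(reg.data) + 1
        if lowest <= requested_index <= reg.index_counter:   # block 702
            return reg.data[requested_index - lowest]        # block 703
    return None


first = Register(data=list("efgh"), full=True, index_counter=8)   # indices 5-8
second = Register(data=list("abcd"), full=True, index_counter=4)  # indices 1-4
third = Register()                                                # empty
path = [first, second, third]
unit = extract(path, 6)
```

In hardware, the comparison against every register would presumably occur in parallel through the multiplexing unit; the sequential loop here is only a software stand-in.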
[0033] FIG. 8 is a schematic block diagram illustrating an embodiment of
operation of a data path that includes 3 registers, 801-803. FIG. 8
is discussed with respect to FIGS. 4-7. Each row of FIG. 8
illustrates the data path at a moment in time, from T=0 to T=4. At
time T=0, each of registers 801-803 is empty. This condition may
occur, for example, at system startup, or between data streams.
Between time T=0 and time T=1, a MAC associated with the data path
receives data from a data stream (block 401), and register 801 is
empty (block 402), so the admission logic associated with register
801 admits a first set of data, containing data having indices 1-16
of a received data stream, from the MAC to register 801 (block
403). The IC of register 801 is updated to 16, as 16 is the highest
index of the data contained in register 801, and register 801's
status indicator is updated to full (block 404). At time T=1, any
data requests for any of data 1-16 may be extracted from register
801 by a multiplexing unit (not shown) associated with the data
path comprising registers 801-803, according to method 700 of FIG.
7. The number of data requests that may be fulfilled per unit of
time is dependent on the number of multiplexers in the multiplexing
unit.
[0034] Between time T=1 and time T=2, because register 801 is full
(block 501) and register 802 is empty (block 502), the transfer
logic between registers 801 and 802 transfers data 1-16 from
register 801 to 802 (block 503), updates the IC of register 802 to
16 and the status indicator of register 802 to full, and updates
the status indicator of register 801 to empty (block 504), which
triggers the admission logic associated with register 801. As the
MAC has more data available (block 401) and register 801 is empty
(block 402), the admission logic admits a second set of data,
containing data indexed at positions 16-32 in the stream, from the
MAC to register 801 (block 403). The status indicator of register
801 is updated to full, and the IC of register 801 is updated to 32
(block 404). At time T=2, any data requests for any of data 1-16
may be extracted from register 802, and any data requests for any
of data 16-32 may be extracted from register 801, according to
method 700 of FIG. 7.
[0035] Between time T=2 and time T=3, because register 802 is full
(block 501) and register 803 is empty (block 502), the transfer
logic between registers 802 and 803 transfers data 1-16 from
register 802 to 803 (block 503), updates the IC of register 803 to
16 and the status indicator of register 803 to full, and updates
the status indicator of register 802 to empty (block 504), which
triggers the transfer logic associated with register 802 and
register 801. Because register 801 is full (block 501) and register
802 is empty (block 502), the transfer logic between registers 801
and 802 transfers data 16-32 from register 801 to 802 (block 503),
updates the IC of register 802 to 32 and the status indicator of
register 802 to full, and updates the status indicator of register
801 to empty (block 504). At time T=3, the MAC has not received
additional data from the data stream, so no data is admitted from
the MAC to empty register 801 at time T=3. At time T=3, any data
requests for any of data 1-16 may be extracted from register 803,
and any data requests for any of data 16-32 may be extracted from
register 802, according to method 700 of FIG. 7.
[0036] Between time T=3 and time T=4, register 803 is full (block
601) and the RPP is empty (block 602), so the eject logic
associated with register 803 determines if there are any pending
data requests for data having an index that is less than or equal to
the IC of register 803, i.e., less than or equal to 16 (block 603).
If there are no pending data requests having an index in the data
stream that is less than or equal to 16, data 1-16 are ejected from
register 803 to the RPP
(block 604), and the status indicator of register 803 is set to
empty (block 605), which triggers the transfer logic associated
with register 803 and register 802. Because register 802 is full
(block 501) and register 803 is empty (block 502), the transfer
logic between registers 802 and 803 transfers data 16-32 from
register 802 to 803 (block 503), updates the IC of register 803 to
32 and the status indicator of register 803 to full, and updates
the status indicator of register 802 to empty (block 504). At this
point, register 801 is empty, so no data transfers into register
802. However, because the MAC has received more data from the data stream
(block 401) and register 801 is empty (block 402), a third set of
data including data 32-48 of the data stream is admitted to
register 801 by the admit logic (block 403). The IC associated with
register 801 is updated to 48, and the status indicator of register
801 is updated to full (block 404). At time T=4, any data requests
for any of data 16-32 may be extracted from register 803, and any
data requests for any of data 32-48 may be extracted from register
801, according to method 700 of FIG. 7. As illustrated by FIG. 8,
the streaming data is available for extraction by the multiplexers
for its entire journey through the sequence of registers 801-803,
allowing efficient recovery of data from the data stream.
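The timeline of FIG. 8 can be reproduced with a short simulation, sketched below under simplifying assumptions: each register is represented only by its index counter (or `None` when empty), the RPP is always empty, no data requests are pending, and the MAC has data available in every interval except between T=2 and T=3. The function name `step` and its parameters are hypothetical.

```python
from typing import List, Optional, Tuple

State = List[Optional[int]]  # IC of registers 801, 802, 803; None means empty


def step(state: State, mac_has_data: bool, next_ic: int,
         rpp_empty: bool = True,
         lowest_pending: Optional[int] = None) -> Tuple[State, int, Optional[int]]:
    """Advance the data path by one time step, working from the tail forward."""
    regs = list(state)
    last = len(regs) - 1
    ejected = None
    # Eject logic (blocks 601-605) on the last register.
    if (regs[last] is not None and rpp_empty
            and (lowest_pending is None or regs[last] < lowest_pending)):
        ejected, regs[last] = regs[last], None
    # Transfer logic (blocks 501-504) between each adjacent pair.
    for i in range(last, 0, -1):
        if regs[i - 1] is not None and regs[i] is None:
            regs[i], regs[i - 1] = regs[i - 1], None
    # Admit logic (blocks 401-404) on the first register.
    if mac_has_data and regs[0] is None:
        regs[0] = next_ic
        next_ic += 16          # assumed register width of 16 data units
    return regs, next_ic, ejected


state: State = [None, None, None]   # T=0: registers 801-803 are empty
next_ic = 16
history = [state]
for mac_has_data in (True, True, False, True):
    state, next_ic, _ = step(state, mac_has_data, next_ic)
    history.append(state)
```

Under these assumptions, `history` holds the index counters of registers 801, 802, and 803 at times T=0 through T=4, ending with `[48, None, 32]` at T=4.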
[0037] FIG. 9 illustrates an example of a computing system 900
which may be utilized by exemplary embodiments of a data path for
data extraction from streaming data as embodied in software.
Various operations discussed above may utilize the capabilities of
the computer 900. One or more of the capabilities of the computer
900 may be incorporated in any element, module, application, and/or
component discussed herein.
[0038] The computer 900 includes, but is not limited to, PCs,
workstations, laptops, PDAs, palm devices, servers, storages, and
the like. Generally, in terms of hardware architecture, the
computer 900 may include one or more processors 910, memory 920,
and one or more input and/or output (I/O) devices 970 that are
communicatively coupled via a local interface (not shown). The
local interface can be, for example but not limited to, one or more
buses or other wired or wireless connections, as is known in the
art. The local interface may have additional elements, such as
controllers, buffers (caches), drivers, repeaters, and receivers,
to enable communications. Further, the local interface may include
address, control, and/or data connections to enable appropriate
communications among the aforementioned components.
[0039] The processor 910 is a hardware device for executing
software that can be stored in the memory 920. The processor 910
can be virtually any custom made or commercially available
processor, a central processing unit (CPU), a digital signal
processor (DSP), or an auxiliary processor among several processors
associated with the computer 900, and the processor 910 may be a
semiconductor based microprocessor (in the form of a microchip) or
a macroprocessor.
[0040] The memory 920 can include any one or combination of
volatile memory elements (e.g., random access memory (RAM), such as
dynamic random access memory (DRAM), static random access memory
(SRAM), etc.) and nonvolatile memory elements (e.g., ROM, erasable
programmable read only memory (EPROM), electronically erasable
programmable read only memory (EEPROM), programmable read only
memory (PROM), tape, compact disc read only memory (CD-ROM), disk,
diskette, cartridge, cassette or the like, etc.). Moreover, the
memory 920 may incorporate electronic, magnetic, optical, and/or
other types of storage media. Note that the memory 920 can have a
distributed architecture, where various components are situated
remote from one another, but can be accessed by the processor
910.
[0041] The software in the memory 920 may include one or more
separate programs, each of which comprises an ordered listing of
executable instructions for implementing logical functions. The
software in the memory 920 includes a suitable operating system
(O/S) 950, compiler 940, source code 930, and one or more
applications 960 in accordance with exemplary embodiments. As
illustrated, the application 960 comprises numerous functional
components for implementing the features and operations of the
exemplary embodiments. The application 960 of the computer 900 may
represent various applications, computational units, logic,
functional units, processes, operations, virtual entities, and/or
modules in accordance with exemplary embodiments, but the
application 960 is not meant to be a limitation.
[0042] The operating system 950 controls the execution of other
computer programs, and provides scheduling, input-output control,
file and data management, memory management, and communication
control and related services. It is contemplated by the inventors
that the application 960 for implementing exemplary embodiments may
be applicable on all commercially available operating systems.
[0043] Application 960 may be a source program, executable program
(object code), script, or any other entity comprising a set of
instructions to be performed. When the application 960 is a source program, the
program is usually translated via a compiler (such as the compiler
940), assembler, interpreter, or the like, which may or may not be
included within the memory 920, so as to operate properly in
connection with the O/S 950. Furthermore, the application 960 can
be written in an object oriented programming language, which has
classes of data and methods, or a procedural programming language,
which has routines, subroutines, and/or functions, for example but
not limited to, C, C++, C#, Pascal, BASIC, API calls, HTML, XHTML,
XML, ASP scripts, FORTRAN, COBOL, Perl, Java, .NET, and the
like.
[0044] The I/O devices 970 may include input devices such as, for
example but not limited to, a mouse, keyboard, scanner, microphone,
camera, etc. Furthermore, the I/O devices 970 may also include
output devices, for example but not limited to a printer, display,
etc. Finally, the I/O devices 970 may further include devices that
communicate both inputs and outputs, for instance but not limited
to, a NIC or modulator/demodulator (for accessing remote devices,
other files, devices, systems, or a network), a radio frequency
(RF) or other transceiver, a telephonic interface, a bridge, a
router, etc. The I/O devices 970 also include components for
communicating over various networks, such as the Internet or
intranet.
[0045] If the computer 900 is a PC, workstation, intelligent device
or the like, the software in the memory 920 may further include a
basic input output system (BIOS) (omitted for simplicity). The BIOS
is a set of essential software routines that initialize and test
hardware at startup, start the O/S 950, and support the transfer of
data among the hardware devices. The BIOS is stored in some type of
read-only-memory, such as ROM, PROM, EPROM, EEPROM or the like, so
that the BIOS can be executed when the computer 900 is
activated.
[0046] When the computer 900 is in operation, the processor 910 is
configured to execute software stored within the memory 920, to
communicate data to and from the memory 920, and to generally
control operations of the computer 900 pursuant to the software.
The application 960 and the O/S 950 are read, in whole or in part,
by the processor 910, perhaps buffered within the processor 910,
and then executed.
[0047] When the application 960 is implemented in software it
should be noted that the application 960 can be stored on virtually
any computer readable medium for use by or in connection with any
computer related system or method. In the context of this document,
a computer readable medium may be an electronic, magnetic, optical,
or other physical device or means that can contain or store a
computer program for use by or in connection with a computer
related system or method.
[0048] The application 960 can be embodied in any computer-readable
medium for use by or in connection with an instruction execution
system, apparatus, or device, such as a computer-based system,
processor-containing system, or other system that can fetch the
instructions from the instruction execution system, apparatus, or
device and execute the instructions. In the context of this
document, a "computer-readable medium" can be any means that can
store, communicate, propagate, or transport the program for use by
or in connection with the instruction execution system, apparatus,
or device. The computer readable medium can be, for example but not
limited to, an electronic, magnetic, optical, electromagnetic,
infrared, or semiconductor system, apparatus, device, or
propagation medium.
[0049] More specific examples (a nonexhaustive list) of the
computer-readable medium may include the following: an electrical
connection (electronic) having one or more wires, a portable
computer diskette (magnetic or optical), a random access memory
(RAM) (electronic), a read-only memory (ROM) (electronic), an
erasable programmable read-only memory (EPROM, EEPROM, or Flash
memory) (electronic), an optical fiber (optical), and a portable
compact disc memory (CDROM, CD R/W) (optical). Note that the
computer-readable medium could even be paper or another suitable
medium, upon which the program is printed or punched, as the
program can be electronically captured, via for instance optical
scanning of the paper or other medium, then compiled, interpreted
or otherwise processed in a suitable manner if necessary, and then
stored in a computer memory.
[0050] In exemplary embodiments, where the application 960 is
implemented in hardware, the application 960 can be implemented
with any one or a combination of the following technologies, which
are well known in the art: a discrete logic circuit(s) having logic
gates for implementing logic functions upon data signals, an
application specific integrated circuit (ASIC) having appropriate
combinational logic gates, a programmable gate array(s) (PGA), a
field programmable gate array (FPGA), etc.
[0051] The technical effects and benefits of exemplary embodiments
include processing of stream data with relatively low latency,
relatively low power consumption, and relatively small hardware
footprint.
[0052] The terminology used herein is for the purpose of describing
particular embodiments only and is not intended to be limiting of
the invention. As used herein, the singular forms "a", "an", and
"the" are intended to include the plural forms as well, unless the
context clearly indicates otherwise. It will be further understood
that the terms "comprises" and/or "comprising," when used in this
specification, specify the presence of stated features, integers,
steps, operations, elements, and/or components, but do not preclude
the presence or addition of one or more other features, integers,
steps, operations, elements, components, and/or groups thereof.
[0053] The corresponding structures, materials, acts, and
equivalents of all means or step plus function elements in the
claims below are intended to include any structure, material, or
act for performing the function in combination with other claimed
elements as specifically claimed. The description of the present
invention has been presented for purposes of illustration and
description, but is not intended to be exhaustive or limited to the
invention in the form disclosed. Many modifications and variations
will be apparent to those of ordinary skill in the art without
departing from the scope and spirit of the invention. The
embodiment was chosen and described in order to best explain the
principles of the invention and the practical application, and to
enable others of ordinary skill in the art to understand the
invention for various embodiments with various modifications as are
suited to the particular use contemplated.
* * * * *