U.S. patent application number 10/871334 was filed with the patent office on 2004-06-18 and published on 2004-12-23 for a method and apparatus for interconnection of flow-controlled communication.
This patent application is currently assigned to PMC-Sierra, Inc. Invention is credited to Bradshaw, John Richard; Brown, Jeffery John; and Loewen, Jonathan David.
Application Number | 20040257997 (10/871334)
Family ID | 32996283
Publication Date | 2004-12-23
United States Patent Application | 20040257997
Kind Code | A1
Loewen, Jonathan David; et al.
December 23, 2004
Method and apparatus for interconnection of flow-controlled communication
Abstract
A method, system, or apparatus provides improved digital
communication. In one aspect, flow control is performed by
receiving status data prepended to data units in a combined data
channel, where the status data indicates the available status of a
number of far-end receiving channels. Thus data may be sent only to
available receiving channels. In a further aspect, a frequency
reference may also be transmitted by including data in data units
in a combined channel. In a further aspect, an active channel can
be selected between two redundant channels by use of an active bit in
said data units. The invention has particular application to
ATM-type communication systems and may also be used in other
communication systems.
Inventors: | Loewen, Jonathan David (Belcarra, CA); Bradshaw, John Richard (Burnaby, CA); Brown, Jeffery John (Maple Ridge, CA)
Correspondence Address: | QUINE INTELLECTUAL PROPERTY LAW GROUP, P.C., P.O. BOX 458, ALAMEDA, CA 94501, US
Assignee: | PMC-Sierra, Inc., Burnaby, CA
Family ID: | 32996283
Appl. No.: | 10/871334
Filed: | June 18, 2004
Related U.S. Patent Documents

Application Number | Filing Date | Patent Number
10871334 | Jun 18, 2004 |
09574305 | May 19, 2000 | 6798744
09574305 | May 19, 2000 |
09569763 | May 12, 2000 |
60134119 | May 14, 1999 |
60134959 | May 19, 1999 |
60136680 | May 28, 1999 |
Current U.S. Class: | 370/235
Current CPC Class: | H04L 47/283 20130101; H04L 47/266 20130101; H04L 69/14 20130101; H04L 47/6255 20130101; H04L 47/10 20130101; H04L 47/2441 20130101; H04L 69/22 20130101; H04L 47/50 20130101
Class at Publication: | 370/235
International Class: | H04L 012/28
Claims
1-19. (Cancelled).
20. A method for providing a timing reference over a serial data
unit stream comprising: determining an edge of a reference clock
signal at a transmit location; including in cells transmitted in
said serial stream a timing reference field wherein said field
value represents a number of bytes after a predetermined point at
which a timing reference edge occurs; and recovering at a receive
location said reference clock signal from said serial stream.
21. The method according to claim 20 wherein if no edge occurs during a
data unit, setting said timing reference field to a null value.
22. The method according to claim 20 wherein any frequency less
than the data unit rate can be transmitted.
23. The method according to claim 20 wherein a recovered clock is
generated one cell period later than the inserted timing.
24. The method according to claim 20 wherein a recovered clock has
a resolution limited to one byte and therefore some jitter in the
recovered clock signal may be present.
25. The method according to claim 24 further comprising applying a
local phase locked loop at said receive location to remove
jitter.
26. The method according to claim 20 wherein every data unit
transmitted over said link contains a timing reference portion.
27-59. (Cancelled)
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority from provisional patent
application 60/134,959, filed May 19, 1999.
[0002] This application also claims priority from provisional
patent application 60/136,680 filed May 28, 1999.
[0003] This application claims priority from patent application
METHOD AND APPARATUS FOR AN OPTIMIZED DIGITAL SUBSCRIBER LOOP
ACCESS MULTIPLEXER filed May 12, 2000 (which claimed priority from
provisional application 60/134,119, filed May 14, 1999).
[0004] Each of these applications is incorporated herein by
reference.
[0005] Co-assigned U.S. Pat. No. 5,260,978, "Synchronous Residual
Time Stamp for Timing Recovery in a Broadband Network," discusses a
number of background issues related to the present invention.
FIELD OF THE INVENTION
[0006] The present invention is related to the field of digital
communications. More specifically, the present invention is
directed to methods and/or systems and/or apparatuses for providing
enhanced flow control in digital signals.
BACKGROUND OF THE INVENTION
[0007] A number of prior art techniques have been proposed and
developed for managing traffic in computer networks using flow
control. Some references to known flow control implementations
include:
[0008] 1. ATM Generic Flow Control (GFC)--For UNI connections, the
first four bits of each ATM cell are reserved for flow
control. As in the present invention, the GFC bits were intended to communicate
transmit-on/transmit-off (XON/XOFF) information, but unlike the
present invention, this prior art flow control applies to the
entire link as opposed to a single virtual connection (VC).
[0009] 2. BECN (Backward Explicit Congestion Notification)--This
technique is per-VC flow control employed in ATM networks. The
feedback is very slow (<10 updates per second); therefore, large
cell buffers must be used in conjunction to avoid cell loss.
[0010] 3. ATM Forum Available Bit Rate (ABR) service--This is based on
specifying the ATM cell rate, as opposed to a simple XON/XOFF
indication. ABR is more suited to end-to-end flow control within
networks with large latencies. It lacks the simplicity of the
present invention.
[0011] 4. ATM Forum QFC--This is a credit based system. It lacks
the simplicity of the present invention.
[0012] 5. Transmission Control Protocol (TCP)--This is a Layer 4
end-to-end flow control (amongst other things) protocol.
[0013] 6. T1 systems--The timing is extracted from the raw frame
rate of the link. This imposes the burden that the clock for the link
itself be very well controlled. With the present invention, a
suitable line rate clock can be generated with a simple 3rd overtone
crystal oscillator that free runs.
[0014] 7. SONET systems--Again, synchronization is based on
extracting timing from the frame rate.
[0015] 8. SRTS--Synchronous Residual Time Stamp is a method for
carrying a timing reference across an ATM network. The source end
generates a 4-bit remainder of the running phase difference between
the source end clock and the network Stratum timing reference and transports this
value to the receiving end, which then regenerates the source end
clock from this 4-bit SRTS value and the Stratum reference.
[0016] The possibility of congestion is inherent in an access
multiplexer, such as a DSLAM. In the downstream direction, the WAN
link can generate a burst of cells for a particular modem at a rate
exceeding the modem's bandwidth capacity. Therefore, feedback to
the traffic scheduler is required to cause it to buffer and smooth
cell bursts to prevent downstream buffer overflow.
[0017] In the upstream direction, the aggregate bandwidth of all
subscribers can exceed that accommodated by the WAN uplink. Flow
control is generally required in a multiple access system to ensure
fair access to the up-link, to minimize cell or data loss, and to
minimize the impact of greedy users on others.
[0018] In DSLAM systems, such as described in greater detail in the
references cited above, flow control has been adapted for a variety
of prior architectures. One class of known prior DSLAM solutions
uses packet or cell switch architectures. This requires signaling
and traffic management functionality on the access port line cards
and on the WAN uplink port card. Additionally, intercard switching
is typically required in these solutions. Some examples of such
solutions are Transwitch-Cubit-based switch architectures,
Motorola-MPC860SAR-based architectures, and
IgT-WAC-185/186/187/188-based switch architectures.
[0019] An alternative prior solution centralizes signaling and
traffic management functionality on the WAN uplink port card by
applying a shaping function on a port basis to all traffic in the
downstream direction. This per port shaping function shapes the
aggregate traffic to a port (such as an xDSL modem) to the handling
rate of that port. This solution thereby attempts to eliminate the
need for further traffic management functionality on the access
port line cards.
[0020] Various of these prior DSLAM flow control techniques suffer
from a number of disadvantages, such as:
[0021] 1. Significantly increased complexity results from providing
the signaling and traffic management functionality on both the
access port line cards and the WAN uplink port card. This occurs
due to the large number of access port line cards in a typical
DSLAM, therefore requiring a large number of physical instances of
this complex and costly functionality.
[0022] 2. Placing the signaling and traffic management
functionality on each access port and WAN uplink card additionally
adds the requirement to provide intercard switching capability. The
intercard switching solution is generally complex due to the large
number of access port cards.
[0023] 3. Placing the signaling and traffic management
functionality on each access port and WAN uplink card additionally
forces a distributed software control and provisioning requirement,
thus adding significant complexity to the software layer.
[0024] 4. Traffic latency and delay variation is increased due to
the traffic transiting two traffic management structures--one on
the access port card and one on the WAN uplink card.
[0025] 5. Solutions using per port traffic shaping to eliminate the
need for traffic management functionality on the access line card,
must adjust the shaping rate in real time, each time the access
port changes rates. This will happen frequently when using rate
adaptive splitterless xDSL technology.
[0026] 6. Solutions using per port traffic shaping to eliminate the
need for traffic management functionality on the access line card
must ensure PHY buffer overflow is avoided. To do this, the shape
rate must be less than the actual PHY rate, since the two rates are
not synchronized. This rate difference represents a loss in
throughput bandwidth.
SUMMARY OF THE INVENTION
[0027] The present invention is directed to providing improved flow
control in certain digital communication environments. In various
embodiments, the present invention may be embodied in devices,
systems, and methods relating to digital communication.
[0028] In particular embodiments, the present invention may be most
easily understood in the context of a Digital Subscriber Loop
Access Multiplexer (DSLAM) architecture, such as the architecture
described in the provisional applications referenced above. In
particular embodiments, the present invention addresses issues that
can arise in DSLAM architectures where the complexity of the system
is concentrated in a few common cards, so that the multitude of
line cards each are simple. In a particular example architecture,
each xDSL signal has a relatively low bit rate and it is therefore
technically feasible to perform ATM layer functions, such as
traffic management, on a single entity. The invention, in
particular embodiments, addresses a side-effect of removing traffic
management queuing from individual line cards. This side effect is
the need to pace the transfer of cells to the line card to avoid
cell loss. The invention in specific embodiments uses a per-PHY
flow-control mechanism to achieve this.
[0029] In specific aspects of specific embodiments, the invention
provides a method of flow control in a digital communications
system wherein data flows in one direction in a combined channel
and in an opposite direction in multiple channels. In the combined
channel, according to the invention, data units (such as cells or
packets) have included in them a portion of data indicating
available/not-available status of channels in the return direction.
Before a channel in the return direction is selected for
transmitting, the available/not-available status provided in the
data portions is checked.
[0030] In further aspects, every data unit flowing in the combined
channel contains such a portion. In a further aspect, each data
unit only provides status of a subset of return channels and
therefore multiple data units are needed to update the status of
all return channels.
[0031] In a further aspect, there is a delay of one or more data
units between when the status is provided in the portion and when the
return scheduler receives the status portion, and therefore
sufficient buffering is provided on said second channels
to compensate for said delay.
[0032] In specific embodiments, portions can be encoded as a set of
bit-flags, the state of each bit flag indicating
available/not-available status of one of said channels in the return
direction.
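The bit-flag encoding described above can be sketched in a few lines. This is an illustrative assumption of one possible layout (an 8-bit portion, bit i for channel base+i); the specification does not fix the field width or bit ordering, and the function names are hypothetical.

```python
# Hypothetical sketch: a one-byte status portion where bit i carries the
# available (1) / not-available (0) state of return channel base + i.
# Field width and bit ordering are illustrative assumptions.

def encode_status_portion(available, base=0, width=8):
    """Pack availability of channels [base, base+width) into an integer."""
    portion = 0
    for i in range(width):
        if available.get(base + i, False):
            portion |= 1 << i
    return portion

def decode_status_portion(portion, base=0, width=8):
    """Unpack a status portion back into a {channel: available} mapping."""
    return {base + i: bool((portion >> i) & 1) for i in range(width)}
```

With each data unit carrying one such portion for a different subset of channels (different `base` values), multiple data units update the status of all return channels, as described above.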
[0033] It will thus be seen that in specific embodiments, the
invention provides a solution enabling real time PHY buffer status
feedback. This eliminates the need for per port traffic shaping on
an uplink card and further in various embodiments enables: (1) real
time, automatic adjustment to PHY rate changes, thus avoiding PHY
buffer overflow conditions which result in traffic loss and
throughput inefficiency; (2) less complex traffic management
functionality on the WAN uplink card due to removal of the per port
traffic shaping function; and (3) maximization of the PHY bandwidth
capability, since the port traffic is played out at the current
maximum PHY rate. In various specific embodiments, the invention
further: (1) is economically and technically scalable; (2) has
low-latency feedback to avoid instabilities or the need for
extensive buffering; (3) has low bandwidth overhead if in-band and
minimizes signal paths if out-of-band; and (4) uses flow control to
avoid creating head-of-line blocking situations.
[0034] Other aspects of the present invention include a method for
providing a timing reference over a data unit stream that allows
frequency matching to any frequency less than the data unit
transmission rate.
[0035] As used herein, cells should be understood to refer to ATM
cells or to any other communications protocol data unit (such as
packets or frames) that may be transmitted or scheduled as
described by or understood from the teachings provided herein.
[0036] While the present invention is described herein in terms of
a particular ATM system embodiment, using the teachings provided
herein, it will be understood by those of skill in the art, that
various methods and apparatus of the present invention may be
advantageously used in other communication systems, including
different ATM systems and systems based on different communications
protocol, such as Ethernet, SONET, etc.
[0037] The invention will be better understood with reference to
the following drawings and detailed descriptions. In different
figures, similarly numbered items are intended to represent similar
functions within the scope of the teachings provided herein.
[0038] Furthermore, it is well known in the art that logic systems
can include a wide variety of different components and different
functions in a modular fashion. Different embodiments of a system
can include different mixtures of elements and functions and may
group various functions as parts of various elements.
[0039] For purposes of clarity, the invention is described in terms
of systems that include many different innovative components and
innovative combinations of components. No inference should be taken
to limit the invention to combinations containing all of the
innovative components listed in any illustrative embodiment in this
specification.
[0040] All publications, patents, and patent applications cited
herein are hereby incorporated by reference in their entirety for
all purposes.
BRIEF DESCRIPTION OF THE DRAWINGS
[0041] FIG. 1 shows a simple example block diagram circuit in which
a method according to the invention can be performed.
[0042] FIG. 2 shows an example format for a link data structure
according to specific embodiments of the invention.
[0043] FIG. 3 is a table illustrating an example format for
prepended field values according to specific embodiments of the
invention.
[0044] FIG. 4 is a table illustrating an example format for
assigned bit oriented codes according to specific embodiments of
the invention.
[0045] FIG. 5 is a block diagram showing example upstream data unit
transfer architecture and cell overhead according to specific
embodiments of the invention.
[0046] FIG. 6 shows a back-pressure architecture according to
specific embodiments of the invention.
[0047] FIG. 7 is a block diagram showing example downstream data
unit transfer architecture and cell overhead according to specific
embodiments of the invention.
DESCRIPTION OF SPECIFIC EMBODIMENTS
1. Flow Control
[0048] In particular embodiments, the invention is particularly
concerned with flow control in a communication system. Flow
control, according to broad embodiments of the invention, can be
effectively performed on serial links, but is independent of
whether the interconnection is a parallel bus or serial link.
[0049] FIG. 1 shows a simple example block diagram circuit in which
a method according to the invention can be performed. This figure
represents a simple architecture which could employ a method
according to the invention. Shown in the figure is a scheduler 18,
a number of ports or channel terminations 14, and a multiplexing
function 16 that in both directions translates between a combined
data unit stream 15, and a plurality of parallel data streams
13.
[0050] Flow control according to the current invention can be
performed in a "downstream" direction from a high-speed scheduler
towards a number of downstream channels. In such a system, the
scheduler has to determine to which channel to send a downstream
data unit and has to be careful not to overload the capacity of the
downstream channels. According to the invention, there is provided
a combined upstream data unit stream 15 from the channels to the
scheduler. According to the invention, data units in the upstream
stream have included into them a data portion (such as a prepended
or postpended field) that indicates available/not-available status
of a number of downstream channels. This portion can be added by a
device such as multiplexer 16, which is in communication with each
of ports 14. A scheduler uses this data to determine to which
channel to send a downstream cell. Typically, this decision
consists of two steps: first any channel that is presenting a
far-end not-available status is eliminated from the scheduling
round. If all far-end channels are not available, a stuff cell may
be generated. Otherwise, a simple round robin algorithm or any
other scheduling algorithm may be used among the remaining eligible
channels to share the downstream link and schedule the next cell to
be sent.
[0051] In a further aspect, it is recognized that there is a delay
in a scheduler learning of the not-available status of a downstream
channel. Therefore, according to the invention, downstream channels
are designed to be able to handle a number of data units after a
not-available status is signaled. For example, a downstream channel
may signal not-available whenever it has only three free data unit
buffers remaining, thus the channel is able to accept three data
units after signaling a not-available status.
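The two-step scheduling decision described above can be sketched as follows. This is a minimal illustration of the decision structure only; the class and method names are hypothetical, and `STUFF` stands in for a generated stuff cell.

```python
# Illustrative sketch of the downstream scheduling decision: eliminate
# channels reporting far-end not-available status, generate a stuff cell
# if none remain, otherwise round-robin among the eligible channels.

STUFF = "stuff-cell"  # placeholder for a generated stuff cell

class DownstreamScheduler:
    def __init__(self, num_channels):
        # availability flags, updated from the upstream status portions
        self.available = [True] * num_channels
        self._rr = 0  # round-robin pointer

    def next_channel(self):
        """Pick the channel for the next downstream cell, or STUFF if none."""
        n = len(self.available)
        if not any(self.available):
            return STUFF  # all far-end channels are not-available
        # round robin among the remaining eligible channels
        for step in range(1, n + 1):
            c = (self._rr + step) % n
            if self.available[c]:
                self._rr = c
                return c
```

The `available` list would be refreshed from each received status portion; the delay in learning a not-available status is absorbed by the buffering headroom described above (e.g. signaling not-available while three buffers remain free).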
2. Timing Reference
[0052] In a further aspect, the present invention provides a means
for carrying a timing reference frequency in a serial data unit
stream. This frequency reference transport mechanism is believed to
be unique in that it operates independently of the serial link bit
rate. According to the invention, the timing reference is carried
inband to avoid the need for additional interconnection for the
function.
[0053] Transmitting a reference frequency is known in the art and
generally referred to as "network synchronization." As will be
recognized in the art, digital communication networks of many
designs operate optimally when all the entities in the network have
a bit rate that is derived from one source. This aspect of the
present invention allows a reference frequency (generally provided
by an external signal) to be propagated in a data stream to
different components in a communication system.
[0054] According to this embodiment, data units transmitted over a
serial link contain a timing reference field (TRF). Although in
one embodiment, the timing reference is targeted at a typical need
of transporting an 8 kHz signal, any frequency less than the cell
rate is permissible.
[0055] The TRF in a cell indicates during which byte reception in
that cell a next edge occurs in the timing reference signal. In
specific embodiments, the TRF value indicates the number of bytes
relative to the start of the cell wherein the edge occurs. (A flag
value, such as all ones, indicates no edge occurs during a
particular data unit.) Thus a circuit or module receiving a byte
stream of protocol data units can recreate a reference
frequency.
[0056] In particular embodiments, the recovered timing event at the
receiving end is generated one cell period later than it was inserted
at the transmitting end, with a resolution of one byte. Because of
the limited resolution, some jitter is present in the recovered
reference frequency. In particular embodiments, an external local
high-Q phase locked loop (PLL) can be used to remove the
jitter.
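The TRF mechanism above can be sketched as an encode/decode pair. The 64-byte extended cell size and the all-ones value 0xFF as the "no edge" flag are assumptions for illustration (the text says only that a flag value such as all ones marks a cell with no edge); a real implementation may differ.

```python
# Minimal sketch of timing reference field (TRF) handling. Assumptions:
# a 64-byte extended cell, a one-byte TRF, and 0xFF as the "no edge in
# this cell" flag value.

CELL_BYTES = 64
NO_EDGE = 0xFF  # flag value: no reference-clock edge during this cell

def make_trf(edge_byte_offset):
    """Transmit side: encode the byte offset of the reference edge."""
    if edge_byte_offset is None:
        return NO_EDGE
    if not 0 <= edge_byte_offset < CELL_BYTES:
        raise ValueError("edge offset outside the cell")
    return edge_byte_offset

def recover_edges(trf_stream):
    """Receive side: turn per-cell TRF values into absolute byte positions.

    The recovered event is one cell period late relative to insertion,
    with a resolution of one byte, so some jitter remains (removable
    with a local PLL, as noted above).
    """
    edges = []
    for cell_index, trf in enumerate(trf_stream):
        if trf != NO_EDGE:
            edges.append(cell_index * CELL_BYTES + trf)
    return edges
```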
[0057] Aspects of the present invention have thus far been
described in terms of general methods and devices. The previous
description and claims are believed to be a full and complete
description sufficient to allow an ordinary practitioner in the art
to make and use the invention as described. It will be understood
to those of skill in the art from the teachings provided herein
that the described invention can be implemented in a wide variety
of specific programming environments and communications systems
using a wide variety of programming languages (such as SQL, Visual
Basic, Pascal, C++, Basic, Java, etc.) and wide variety of file or
communications formats.
[0058] What follows are descriptions of example systems and methods
that embody various aspects of the present invention and that
include additional novel aspects. This following discussion is
included, in part, in order to disclose particularly preferred
modes presently contemplated for practicing the invention. It is
intended, however, that the previous discussion and the claims not
be limited by examples provided herein. It is further intended that
the attached claims be read broadly in light of the teachings
provided herein. Where specific examples are described in detail,
no inference should be drawn to exclude other examples known in the
art or to exclude examples described or mentioned briefly from the
broad description of the invention or the language of the claims.
It is therefore intended that the invention not be limited except
as provided in the attached claims and equivalents thereof.
3. Example System Implementation
[0059] In specific aspects, the present invention may be further
understood in the context of operation of a specific architecture
for digital communications. This example architecture is described
in above referenced applications and is referred to at times as
embodied in the assignee's products VORTEX.TM., DUPLEX.TM., and
APEX.TM., which are particular logic implementations that may
include aspects of the present invention. However, the invention
has applications in various communication system architectures and
is not intended to be limited except as provided by the attached
claims. To facilitate understanding, the following discussion
includes many details of the operation of particular example VORTEX
and DUPLEX modules. These details are not elements of all
embodiments of the present invention, which should not be limited
except as provided in the attached claims.
[0060] As described in greater detail in above referenced
documents, a particular DSLAM architecture provides interconnection
between line cards and an uplink processor via serial links.
(Serial, as used herein and as commonly understood in the ATM art,
can include multi-line serial connections, such as a four-wire
serial ATM cell connection. In this usage, serial refers to the
transmission of cells or portions of cells one after the other,
even if some bits are transferred in parallel to improve speed. In
a particular embodiment, the serial connections comprise two
twisted pair connections, one in the upstream direction and one in
the downstream direction, each twisted pair transmitting a serial
bit-stream as understood in the art.)
[0061] Data to and from the line cards are transferred on these
high-speed links. (In an example embodiment, serial link
transceivers support UTP-5 cable lengths up to 10 meters.)
Typically, to avoid clock skew issues, a separate clock is not
transmitted and the receivers recover a local clock from the
incoming data. In a system according to the invention, the serial
links typically carry data units (such as ATM cells or other
addressed packets, such as Ethernet packets) with prepended bytes.
An example cell format is illustrated in FIG. 2. A WAN uplink
processor (such as in VORTEX) appends the first four bytes and a
Header Check Sequence (HCS) byte in the downstream direction and
strips them off and parses them in the upstream direction. With
respect to the VORTEX and DUPLEX, the remainder of the bytes in the
data structure may be transferred transparently. In a specific
embodiment, the bytes are serialized most significant bit first,
the bit stream is a simple concatenation of the extended cells, and
cell rate decoupling is accomplished through introduction of stuff
cells.
[0062] In a particular implementation and embodiment, the
transmitter inserts a correct CRC-8 that protects both the ATM cell
header and prepended bytes in the HCS byte. The receiver uses the
HCS byte for delineation. Failure to establish cell alignment
results in a loss of cell delineation (LCD) alarm. The entire bit
stream is scrambled with an x^43+1 self-synchronous
scrambler.
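The x^43+1 self-synchronous scrambler named above has a simple bit-level structure: each scrambled bit is the data bit XORed with the scrambler's own output 43 bits earlier, and the descrambler mirrors this using its input history, so it resynchronizes on its own after 43 bits. A sketch (bit-at-a-time processing here is for clarity only; hardware would do this in the serializer):

```python
# Bit-level model of an x^43 + 1 self-synchronous scrambler/descrambler.
from collections import deque

def scramble(bits):
    delay = deque([0] * 43, maxlen=43)  # last 43 *output* bits
    out = []
    for b in bits:
        s = b ^ delay[0]  # XOR with output 43 bits ago
        delay.append(s)
        out.append(s)
    return out

def descramble(bits):
    delay = deque([0] * 43, maxlen=43)  # last 43 *input* bits
    out = []
    for b in bits:
        out.append(b ^ delay[0])  # XOR with input 43 bits ago
        delay.append(b)
    return out
```

Because the descrambler keys only off received bits, a bit error corrupts at most two descrambled bits (the errored bit and its echo 43 bits later) and the stream then recovers without any external synchronization.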
[0063] FIG. 2 shows an example of a High-Speed Serial Link Data
Structure that may be employed according to an embodiment of the
invention. FIG. 3 is a table illustrating an example format for
prepended field values according to specific embodiments of the
invention.
[0064] FIG. 5 is a block diagram showing example upstream data unit
transfer architecture and cell overhead according to specific
embodiments of the invention. As shown in the figure, in the
upstream direction, line cards 100 add header information to cells
by operation of DUPLEX 400. This overhead is used at the VORTEX 300
to perform downstream scheduling. The VORTEX adds additional
overhead to upstream cells, which may be used in further upstream
processing.
[0065] FIG. 6 shows a back-pressure architecture according to
specific embodiments of the invention. This back-pressure is
transmitted to APEX 200, by VORTEX 300, and is exerted through the
operation of the status data in the VCI/VPI prepend transmitted in
the upstream cells.
[0066] FIG. 7 is a block diagram showing example downstream data
unit transfer architecture and cell overhead according to specific
embodiments of the invention. As shown in the figure, the various
Vortex ID, Link ID, and PHY ID are decoded by the VORTEX and the
cells are placed by the Vortex on the appropriate serial
connections to the line cards.
[0067] 3.1. Link Integrity Monitoring
[0068] Although the serial link bit error rate can be inferred from
the accumulated Header Check Sequence (HCS) errors, the option
exists to perform error monitoring over the entire bit stream. When
the feature is enabled the second User Prepend byte transmitted
shall be overwritten by the CRC-8 syndrome for the preceding cell.
The encoding is valid for all cells, including stuff cells. The
CRC-8 polynomial is x^8+x^2+x+1. The receiver generally
raises a maskable interrupt and optionally increments the HCS error
count. Simultaneous HCS and cell CRC-8 errors result in a single
increment.
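The CRC-8 with generator polynomial x^8+x^2+x+1 (0x07) named above can be computed bitwise, MSB first. A zero initial value and no final XOR are assumptions here; the device's actual register conventions are not stated in the text.

```python
# CRC-8 over a byte sequence, generator polynomial x^8 + x^2 + x + 1
# (0x07), MSB-first, assumed init 0x00 and no output XOR.

def crc8(data: bytes, poly: int = 0x07) -> int:
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc
```

In the monitored link, this syndrome for the preceding cell overwrites the second User Prepend byte, so the receiver can check it over the entire bit stream rather than inferring error rate from HCS errors alone.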
[0069] 3.2. Bit Oriented Codes
[0070] A system according to the invention in specific embodiments
uses Bit-Oriented Codes (BOCs). Such coding was previously known in
T1 transmission systems and is described in ANSI T1.403-1995.
function was implemented with circuits (i.e. XBOC and RBOC)
designed in approximately 1989. However, according to the current
invention, BOC functionality is adapted to DSLAM-type multiplexing
and to an ATM environment.
[0071] In a particular embodiment, Bit Oriented Codes (BOCs) are
carried in the BOC bit position in the System Prepend. The 63
possible codes can be used to carry predefined or user defined
signaling or control status. Bit oriented codes can be transmitted
as a repeating 16-bit sequence consisting of 8 ones, a zero, 6 code
bits, and a trailing zero (111111110xxxxxx0). The code to be
transmitted is programmed by writing the Transmit Bit Oriented Code
register. The autonomously generated Remote Defect Indication (RDI)
code, which is generated upon a loss-of-signal or
loss-of-cell-delineation, takes precedence over the programmed
code. RDI insertion can be disabled via the RDIDIS bit of the
Serial Link Maintenance register. RDI can be inserted manually by
setting the Transmit Bit Oriented Code register to all zeros.
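Building one period of the repeating 16-bit BOC pattern described above (8 ones, a zero, 6 code bits, a trailing zero) is straightforward. Transmitting the 6 code bits MSB first is an assumption made here for illustration.

```python
# One 16-bit period of the Bit Oriented Code pattern 111111110xxxxxx0,
# for a 6-bit code value. MSB-first code ordering is an assumption.

def boc_sequence(code: int) -> str:
    """Return one 16-bit period of the BOC pattern for a 6-bit code."""
    if not 0 <= code < 64:
        raise ValueError("BOC code must be a 6-bit value")
    return "11111111" + "0" + format(code, "06b") + "0"
```

For example, the all-zeros code, which the text notes inserts RDI manually, yields `1111111100000000`; the run of 8 ones followed by a zero lets a receiver find the start of each period.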
[0072] A receiver can be enabled to declare a received code valid
if it has been observed for 8 out of 10 times or for 4 out of 5
times, as specified by the AVC bit in the Bit Oriented Code
Receiver Enable register. Unless fast declaration is necessary, it
is recommended that the AVC bit be set to logic 0 to improve bit
error tolerance. Valid BOC are indicated through the Receive Bit
Oriented Code Status register. The BOC bits are set to all ones
(111111) if no valid code has been detected. A maskable interrupt is
generated to signal when a detected code has been validated, or
optionally, when a valid code goes away (i.e. the BOC bits go to
all ones). When the receiver is out of cell delineation (OCD),
the Receive Bit Oriented Code Status register will produce all ones
(111111).
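The "8 out of 10" / "4 out of 5" acceptance rule above can be modeled with a sliding window of recently received codes. This windowed interpretation is an assumption drawn from the text, not a register-level model of the device.

```python
# Hypothetical model of BOC validation: a code is declared valid when it
# has been observed 8 of the last 10 times (AVC = 0, better bit error
# tolerance) or 4 of the last 5 times (AVC = 1, fast declaration).
from collections import deque

class BocValidator:
    def __init__(self, fast=False):
        self.window, self.needed = (5, 4) if fast else (10, 8)
        self.history = deque(maxlen=self.window)

    def observe(self, code):
        """Feed one received 6-bit code; return the validated code or None."""
        self.history.append(code)
        for candidate in set(self.history):
            if self.history.count(candidate) >= self.needed:
                return candidate
        return None
```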
[0073] Valid codes in one specific embodiment are provided in the
table shown in FIG. 4. Reserved codes anticipate future enhanced
feature set devices. User Defined codes may be used without
restriction. Regardless of definition, all 63 codes may be
validated and read by a microprocessor.
[0074] 3.3. Loop Back
[0075] The RXDn± data is looped back onto TXDn± at the end of
the reception of a loopback activate code. For the loopback to be
enabled, the loopback code is first validated (received 8 out of 10
times at least once) and then invalidated, typically by reception of
another code. The loopback is not enabled upon initial validation of
the loopback activate code because the looped back signal, which
still contains the original loopback activate command, would cause
the far-end receiver to go into metallic loopback, thereby forming
an undesirable closed loop condition. The loopback is cleared
immediately upon the validation of the loopback deactivate code,
assuming the MLB register bit is logic 0.
[0076] To produce a loopback at the far end, one programs the
Transmit Bit Oriented Code register with the loopback activate code
for at least 1 ms and then reverts to another (typically idle)
code. Upon termination of the loopback activate code, the data
transmitted on TXDn.+-. is expected to be received verbatim on the
RXDn.+-. inputs. When transmitting a loopback activate code, it is
recommended the RDIDIS register bit be set to logic 1, or else a
loss-of-signal or loss-of-cell-delineation event would cause a
premature loopback due to a pre-emptive Remote Defect Indication
(RDI) code being sent.
[0077] The remote reset activate and deactivate code words are
supported by a line card device (DUPLEX). The uplink can send the
reset activate code to cause the line card device to assert its
active low RSTOB output. The deactivate code causes deassertion of
RSTOB.
[0078] The Remote Defect Indication (RDI) is sent whenever Loss of
Signal (LOS) or Loss of Cell Delineation (LCD) is declared. This
code word takes precedence over all others.
[0079] 3.4. Cell Delineation Process
[0080] As described in the above-referenced documents, the VORTEX
performs HCS cell delineation, payload descrambling, idle cell
filtering and header error detection to recover valid cells from
the receive high-speed links. These functions are performed in the
spirit of ITU-T Recommendation I.432.1, but support 9 to 13 byte
cell headers. For additional information, refer to the above
referenced documents.
[0081] 3.5. Protection Switching Protocol (Fault Tolerance
Architectural Elements)
[0082] In a further aspect, the VORTEX and its sister device, the
DUPLEX inherently support system architectures requiring fault
tolerance and 1:1 redundancy of the system's common equipment. In
point-to-point backplane architectures such as these, the 1:1
protection also includes the associated serial links (also referred
to as Low Voltage Differential Signal or LVDS) connecting the
common equipment to the line cards. VORTEX and DUPLEX perform clock
recovery, cell delineation, and header error monitoring for all
receive high-speed serial links simultaneously. The maintained
error counts and alarm status indications may be used by the
control system to determine the state and viability of each serial
link.
[0083] In these architectures, a DUPLEX will be connected to two
VORTEXs, one on the active common card and one on the spare common
card. Upon a failure of the active card, the spare card becomes the
conduit for traffic. The VORTEX facilitates link selection upon
start-up as well as switching between links upon failure
conditions.
[0084] Typically a centralized resource or cooperating distributed
microprocessor subsystems will determine which common card is to be
considered active for each downstream DUPLEX. One key to link
selection lies in how the "ACTIVE" bit is handled by the VORTEX and
DUPLEX. The control system uses the ACTIVE bit within each of the 8
Serial Link Maintenance registers to independently set the state of
each link's ACTIVE status. The current state of the link's ACTIVE
bit is sent downstream once per transmitted cell. The ACTIVE status
is debounced and acted upon by the DUPLEX. The DUPLEX will only
accept data traffic from one of its two serial links, and normally
it is the link marked ACTIVE that is considered to be the working
link. However, the DUPLEX can override this using local control.
Thus, although the VORTEX may indicate the ACTIVE and spare links,
it is actually the DUPLEX that must effect the protection
switching.
[0085] The DUPLEX reflects back an ACTIVE bit status to indicate
the link chosen as active. This reflected ACTIVE bit does not have
a direct effect on the VORTEX, but its status is debounced (must
remain the same for 3 received cells) and then stored by the VORTEX
in the Receive High-Speed Serial Cell Filtering
Configuration/Status register. The reflected status can be used by
the local control system to confirm receipt of the ACTIVE status by
the DUPLEX.
4. Data Buffering and Flow Control
[0086] Upstream and downstream flow control according to the
present invention (and also flow control as implemented in a
particular version of the VORTEX) can be further understood in the
context of an overall system, including the role played by the
line-card devices (such as eight DUPLEX devices in a particular
implementation) connected to the uplink. Portions of VORTEX &
DUPLEX TECHNICAL OVERVIEW are reproduced below. For additional
information, the reader is referred to above-referenced documents
and to PMC-Sierra's VORTEX & DUPLEX TECHNICAL OVERVIEW.
[0087] Below is further described cell buffering and flow control
aspects according to specific embodiments of the present invention,
using as an example elements of a particular implementation of
VORTEX and DUPLEX logic. While this specific example implementation
uses serial links, flow control, according to broad embodiments of
the invention, is independent of whether the interconnection
between the line cards and uplink cards is a parallel bus or serial
link and only presumes an upstream data unit stream. In specific
embodiments, aspects of the invention are optimized to use serial
links. In a particular VORTEX and DUPLEX, serial links were chosen
because they are more scalable and reliable than parallel
buses.
[0088] 4.1. Downstream Traffic Flow Control
[0089] A particular embodiment of the VORTEX has 33 one-cell-deep
buffers for each of the 8 downstream serial links. Below is
described how, on a per link basis, the VORTEX schedules cells out
of these 33 cell buffers and transmits them on their serial link.
An individual link is described here, but the reader is reminded
that in one embodiment of the invention, there is no scheduling
interaction or interdependence among the 8 serial links--each has
its own 33 cell buffer and each has its own scheduler. Therefore,
extending the described method to multiple links will be apparent
from the teachings provided herein.
[0090] Downstream scheduling occurs when the previous cell has been
fully transmitted over the downstream link. In other words, once a
cell (data or stuff cell) has been scheduled the entire cell is
sent before another cell is scheduled. Cells are transmitted back
to back, without a substantial gap. The scheduling event according
to the invention happens quickly enough that there is no added cell
transmission delay. When there is no buffered data in any of the 33
buffers for a particular serial link, the VORTEX generates a stuff
cell and sends it on the link. A stuff cell meets all the
requirements of a standard data cell, including valid system
overhead information, but stuff cells are discarded by the far-end
receiver.
[0091] When there are one or more non-empty buffers for a link, a
scheduler (in VORTEX, sometimes referred to as a core card
scheduler, or uplink scheduler) must decide which of the far-end
channels (in the example, up to 32 PHYs or ports plus the
microprocessor port) should have its buffered cell scheduled onto
the downstream link. This decision consists of two steps: first any
channel that is presenting a far-end buffer full status (one
example of how this may be done is described below) is eliminated
from this scheduling round. If all far-end channels have full
buffers, a stuff cell is generated and transmitted. Otherwise, a
round robin algorithm is used (though any other access algorithm
could be used) among the remaining eligible channels to share the
downstream link fairly and schedule the next cell to be sent.
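The two-step scheduling decision can be sketched as follows. This is a simplified model, not the device's actual logic; the class name, buffer representation, and stuff-cell sentinel are all illustrative:

```python
STUFF_CELL = "STUFF"  # placeholder for a generated stuff cell

class DownstreamScheduler:
    """Per-link scheduler over 33 one-cell buffers (32 PHYs plus the
    microprocessor port), as described for one embodiment above."""
    def __init__(self, num_channels=33):
        self.buffers = [None] * num_channels    # one-cell-deep buffer per channel
        self.far_end_full = [False] * num_channels
        self.rr_pointer = 0                     # round-robin position

    def next_cell(self):
        n = len(self.buffers)
        # Step 1: channels with a far-end buffer-full status (or nothing
        # buffered) are eliminated from this scheduling round.
        # Step 2: round robin among the remaining eligible channels.
        for i in range(n):
            ch = (self.rr_pointer + i) % n
            if self.buffers[ch] is not None and not self.far_end_full[ch]:
                cell = self.buffers[ch]
                self.buffers[ch] = None
                self.rr_pointer = (ch + 1) % n
                return cell
        # No eligible channel: generate a stuff cell to keep the link busy.
        return STUFF_CELL
```

As the text notes, any other access algorithm could replace the round robin in step 2; only the elimination of full or empty channels in step 1 is essential to the flow-control scheme.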
[0092] 4.1.1. Upstream Signaling of Downstream Status
[0093] In a specific implementation, as shown in FIG. 3, each cell
transmitted over each of the upstream serial links contains a
portion (in one example, 16 bits, labeled CA[15:0]) of information
that conveys the far-end cell buffer status (full or not full) for
16 of the active PHYs (there are generally a total of 32 for each
serial link) supported on each link. Therefore, in this embodiment,
after two cells are received on the upstream link the downstream
buffer status of all 32 far-end PHYs has been updated. In a
specific embodiment, a separate overhead bit per cell conveys the
buffer status of the far-end microprocessor port. In this
embodiment, therefore, at any given instant a (VORTEX) scheduler is
using information that is either one or two cells out of date.
Therefore, a far-end device (typically the DUPLEX) generally will
have enough per-PHY buffer space to accommodate the slight delay in
conveying the "buffer full" information to the scheduler. The
scheduler uses the full or not full information to determine which
channels should be involved in the current round of scheduling, as
discussed above.
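The folding of each cell's CA[15:0] field into the 32-entry status table can be sketched as below. The assumption that the two halves alternate and are selected by a flag is illustrative; the actual field mapping is defined by FIG. 3. A set bit is read here as "far-end buffer not full", consistent with the cell-available convention used elsewhere in this description:

```python
def update_phy_available(avail, ca_field, upper_half):
    """Fold one received cell's CA[15:0] overhead into the 32-entry
    cell-available table.

    avail      : list of 32 booleans, True = far-end buffer not full
    ca_field   : 16-bit integer taken from the cell's overhead
    upper_half : which 16 PHYs this cell covers (assumed to alternate
                 cell by cell, so two cells refresh all 32 entries)
    """
    base = 16 if upper_half else 0
    for bit in range(16):
        avail[base + bit] = bool((ca_field >> bit) & 1)
    return avail
```

Because each half is refreshed on alternate cells, any entry the scheduler consults is at most two cell times old, which is why the far-end device needs the per-PHY buffer headroom noted above.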
[0094] 4.2. Upstream Traffic Flow Control
[0095] The upstream traffic flow control within the VORTEX allows
for some system engineering flexibility. When the system is
engineered such that maximum aggregate burst upstream bandwidth is
less than or equal to the link and device bandwidth at each stage
of concentration, congestion will not occur prior to upstream
traffic queuing in the TM (traffic manager) device. (Upstream
queues could congest due to restricted up-link capacity, in which
case appropriate congestion management algorithms within the TM
device should be invoked.) In this case, upstream traffic flow
control is unnecessary and need not be utilized in DUPLEX or VORTEX
type devices.
[0096] However, when a system is engineered such that upstream
burst bandwidth capacity can exceed the link and bus bandwidth then
(depending on parameters such as the over subscription employed,
misbehaving users, or traffic burst scenarios) congestion at the
upstream uplink (e.g. VORTEX) buffers can occur. To ensure that
these buffers do not overflow, upstream traffic flow control is
implemented according to further aspects of the invention (in a
specific implementation by the VORTEX and DUPLEX).
[0097] Unlike the downstream direction, the upstream direction does
not require per channel buffering or per channel buffer status
indication. In the VORTEX, each of the (eight) upstream serial links
is provided with a simple six cell FIFO. The SCI-PHY/Any-PHY bus
slave state machine services the FIFOs with a weighted round-robin
algorithm and presents the data to the upstream bus master as a
single cell stream.
[0098] In aggregate, the 8 upstream links can burst data into the
VORTEX at up to 1.6 Gb/s, which is twice the maximum bandwidth of
the upstream bus. Further, the bus master may be servicing several
VORTEX devices at once or be otherwise restricted in the maximum
sustained bandwidth it is able to receive from the VORTEX.
Therefore, the potential to overflow one or more of the 6 cell
upstream FIFOs is a real possibility.
[0099] Therefore, when any upstream FIFO has less than three empty
cell buffers, it deasserts the cell available (CA[0]) bit sent in
the system overhead of the corresponding downstream serial link
(see FIG. 3). It is the responsibility of the far end device
(typically a DUPLEX) to start sending stuff cells immediately upon
indication that the VORTEX can accept no more traffic. By setting
the full mark at three cells, the VORTEX allows for two additional
cells to be accepted after CA[0] is deasserted. This accommodates
far-end latency in reaction to the CA[0] indication.
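The watermark behavior of the six-cell FIFO can be sketched as follows; the class and property names are illustrative only:

```python
class UpstreamFifo:
    """Six-cell upstream FIFO whose CA[0] bit deasserts once fewer than
    three cell buffers remain empty (a sketch of the scheme above)."""
    DEPTH = 6

    def __init__(self):
        self.cells = []

    @property
    def ca0(self):
        # CA[0] stays asserted while at least three buffers are empty,
        # leaving room for two in-flight cells after deassertion.
        return (self.DEPTH - len(self.cells)) >= 3

    def push(self, cell):
        if len(self.cells) >= self.DEPTH:
            raise OverflowError("upstream FIFO overflow")
        self.cells.append(cell)
```

Setting the full mark at three empty buffers, as the text describes, means the far end may legitimately deliver two more cells after CA[0] drops without risk of overflow.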
[0100] 4.3. Timing Reference Insertion and Recovery
[0101] In a further aspect, the system design provides a means for
carrying a timing reference frequency in a serial data unit stream.
This frequency reference transport mechanism is believed to be
unique in that it operates independently of the serial link bit
rate. According to the invention, the timing reference is carried
inband to avoid the need for additional interconnection for the
function.
[0102] The reason for wanting to carry a reference frequency is
known in the art and generally referred to as network
synchronization. As will be recognized in the art, digital
communication networks of particular design operate optimally when
all the entities in the network have a bit rate that is derived
from one source. This aspect of the present invention allows a
reference frequency (generally provided by an external signal) to
be propagated in a data stream to different components in a
communication system.
[0103] The high-speed serial links are capable of transporting a
timing reference in both directions, independent of the serial bit
rate. As shown in FIG. 3, every cell transmitted over the serial
link contains a timing reference field called TREF[5:0]. Although
the timing reference is targeted at a typical need of transporting
an 8 kHz signal, its frequency is not constrained to 8 kHz. Any
frequency less than the cell rate is permissible.
[0104] In the transmit direction, rising edges on a TX8K input (an
externally provided reference frequency) are encoded in the cells
transmitted on all eight serial links. For each of the serial
links, the rising edge of TX8K causes an internal counter to be
initialized to the cell length minus 1. The counter decrements with
each subsequent byte transmitted until the fourth byte of the next
cell with prepend, at which point the state of the counter is
written into the outgoing TREF[5:0] field. If no rising edge on
TX8K has occurred, TREF[5:0] is set to all ones.
[0105] In the receive direction, the VORTEX is typically receiving
cells from a DUPLEX device, which implements the same TX8K process
described above. As determined by the value of the RX8KSEL[2:0]
bits in a Master Configuration register (this value indicates the
serial link from which a timing reference will be recovered), the
timing signal received over one of the eight serial links is
recreated on RX8K, which may be output by the VORTEX to other
system components.
[0106] The VORTEX monitors the TREF[5:0] field on the selected
upstream serial link and initializes an internal counter to the
value of TREF[5:0] each time the field is received. The counter
decrements with each subsequent byte received. When the count
becomes zero, a rising edge is generated on RX8K. If the value of
TREF[5:0] is all ones, RX8K remains low. RX8K is left asserted for
two high speed (REFCLK) reference clock periods, and then it is
deasserted.
[0107] The recovered timing event is generated one cell period
later than the inserted timing with a resolution of one byte.
Because of the limited resolution, some jitter is present. At a
link rate of 155.52 Mb/s, 52 ns of peak-to-peak jitter will occur
on RX8K. An external local high-Q phase locked loop (PLL) can be
used to remove the jitter.
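The TREF[5:0] encode/decode path and the jitter bound can be illustrated with a simplified byte-stream model. The cell length and byte positions below are illustrative, and the exact byte alignment in the real device may differ by a byte or so; the point of the sketch is that the counter arithmetic cancels out the serial bit rate, leaving only a one-byte resolution:

```python
CELL_LEN = 60          # bytes per cell including prepend (illustrative)
LINK_RATE = 155.52e6   # link rate in bits per second

def encode_tref(edge_byte, next_cell_start):
    """On a TX8K rising edge the counter loads CELL_LEN - 1 and decrements
    once per transmitted byte; its value at the 4th byte of the next cell
    is latched into the outgoing TREF[5:0] field."""
    latch_byte = next_cell_start + 3
    return (CELL_LEN - 1) - (latch_byte - edge_byte), latch_byte

def decode_tref(tref, latch_byte):
    """The receiver reloads its counter from TREF[5:0] on arrival and
    pulses RX8K when the count reaches zero."""
    return latch_byte + tref

tref, latch = encode_tref(edge_byte=10, next_cell_start=CELL_LEN)
recovered_byte = decode_tref(tref, latch)
# The recovered edge lands about one cell period after the inserted edge.

# One-byte resolution at the link rate bounds the peak-to-peak jitter:
jitter_ns = 8 / LINK_RATE * 1e9   # about 51.4 ns, quoted as 52 ns above
```

In this model the recovered edge trails the inserted edge by one cell period regardless of where in the cell the edge occurred, which is the sense in which the transport is independent of the serial bit rate.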
5. Further Details of Operation of Duplex and Vortex
[0108] This section describes in further detail how a VORTEX and
DUPLEX can implement a flow controlled data path according to
specific embodiments of the present invention, between the line
cards and the core card(s).
[0109] 5.1. Multiplexing on the Line Card (Stage 1
Multiplexing)
[0110] As long as the PHY devices are Utopia L2 compliant, there
will be no glue logic or external circuitry needed to interface the
DUPLEX to the logical PHY devices. Using the SCI-PHY Utopia
extension (as defined by PMC-Sierra) up to 32 PHY devices can be
served. Because the DUPLEX is a mixed analog/digital device with
integrated clock synthesis unit (CSU) and clock recovery unit (CRU)
there is no external circuitry needed to support the high speed
serial interface between the DUPLEX and VORTEX. In this way, parts
count and board area on the line card are kept to an absolute
minimum.
[0111] 5.1.1. Upstream (to the Core Card) Traffic Control on the
Line Card
[0112] A single parallel data bus (e.g. Utopia L2) is used to
connect all PHY or multi-PHY devices to the DUPLEX. The PHYs are
slave devices, the DUPLEX is the Utopia bus master. The bus is
normally 8 bits wide with a maximum clock rate of 25 MHz (i.e.
maximum 200 Mbps bus bandwidth), although a 16-bit, 33 MHz bus is
also supported.
[0113] As bus master, the DUPLEX continuously polls the Receive
Cell Available (RCA) status lines of each PHY, reading cells from
the PHYs as they become available. Once a cell is brought into the
DUPLEX the cell is tagged (via a prepend byte) with the appropriate
PHY ID (0:31) and sent simultaneously to both core cards (active
and standby) over the DUPLEX's two high speed serial links.
[0114] For the applications being targeted by this architecture,
the 200 Mbps bandwidth capacity of the Utopia bus and the serial
link is greater than the aggregate maximum upstream bandwidth of
the PHYs. (For example, a line card with 32 E1 ports would, worst
case, cause an upstream burst of only 64 Mbps. For DSLAM
applications based on ADSL modems, the worst-case upstream
bandwidth requirement is even less.) Hence, there is no need for
extensive buffering in the DUPLEX. (Though in one implementation,
the DUPLEX does have a minimal internal cell buffer in the upstream
direction. This buffering improves throughput by ensuring that cell
transfers from the Utopia bus to the serial link can occur
back-to-back without idle periods on the link.) The internal four
cell FIFO required on all Utopia compliant PHYs ensures that each
modem can buffer cells while it waits for the DUPLEX to service
it.
[0115] Since the serial link to the core card is not shared by any
other line cards there is no need to conserve bandwidth by
implementing port-to-port switching on the line card. Leaving all
switching functions on the core card greatly simplifies the
hardware and software requirements of the line card. This is one of
the fundamental advantages of using a point-to-point interface
between the line card and core card rather than, for example, a
shared bus architecture where multiple line cards share a high
speed bus.
[0116] In summary, in particular embodiments, the upstream data
path on the line card operates according to the following:
[0117] 1. Upstream cells are received by the PHY device and
buffered until the DUPLEX reads them;
[0118] 2. The connection between the PHYs and the DUPLEX may be a
standard connection, such as Utopia L2;
[0119] 3. PHYs operate as bus slaves, the DUPLEX operates as bus
master;
[0120] 4. After a cell is transferred from the PHY to the DUPLEX,
it is tagged in the DUPLEX with a PHY ID and is sent to the active
(and simultaneously to the standby core card where it is present)
via point-to-point serial links.
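Step 4 above, the tagging of an upstream cell with its PHY ID, can be sketched as follows. The function name is hypothetical, and the real prepend carries additional system overhead beyond the single byte modeled here:

```python
def tag_upstream_cell(cell, phy_id):
    """Prepend the PHY ID (0..31) to a cell read from a PHY before it is
    sent simultaneously on both high-speed serial links (a sketch)."""
    if not 0 <= phy_id <= 31:
        raise ValueError("PHY ID out of range")
    return bytes([phy_id]) + cell

# A 53-byte ATM cell from PHY 7 becomes a 54-byte tagged cell:
tagged = tag_upstream_cell(b"\x00" * 53, phy_id=7)
```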
[0121] 5.1.2. Downstream (to the Line Card) Traffic Control on the
Line Card
[0122] Downstream traffic is sent to the line card from the core
card over the serial link. Where a redundant link is present, a
line card (DUPLEX) will only accept cells from the serial link that
has identified itself as "active" via an embedded control signal,
though user cells on the inactive link are monitored for errors and
then discarded and any embedded control channel cells received on
the inactive link are sent to the microprocessor port. This allows
the core cards to control when protection switching between serial
links (and hence core cards) occurs. (If a distributed control
architecture is desired, the line card's microprocessor can
override the active channel selection via a control register.)
[0123] All downstream cells will have been tagged on the core card
(by the ATM layer device) with a prepend (0:31) that identifies
the PHY for which the cell is destined. For each cell it receives,
the DUPLEX
strips off the prepend, leaving just the original cell. It
temporarily buffers the cell in a shallow (in one example, four
cells per PHY) internal FIFO. This creates full "cell rate
decoupling" between the Utopia bus and the serial link. Therefore
the bus and the link can be clocked asynchronously.
[0124] As bus master, the DUPLEX continuously polls the Transmit Cell
Available (TCA) status lines of all PHYs for which it has a
downstream cell buffered. When a PHY has room (e.g. when the TCA is
asserted), the DUPLEX sends the next buffered cell to the PHY (in a
specific implementation over the Utopia bus).
[0125] To prevent its internal buffers from overflowing the DUPLEX
implements per-PHY back-pressure flow control, as described above.
Back-pressure signaling is sent to the VORTEX via embedded overhead
on the upstream serial link. The DUPLEX indicates back-pressure for
a PHY if the corresponding internal FIFO is holding two or more
cells. Because buffering and back-pressure are implemented
independently on a per-PHY basis, no head of line blocking
occurs.
[0126] In summary, in particular embodiments, the downstream data
path on the line card operates according to the following:
[0127] 1. The DUPLEX receives downstream cells on the active serial
link, strips off the PHY ID and places the cells in an internal
per-PHY buffer.
[0128] 2. The DUPLEX sends per-PHY back-pressure to the VORTEX via
the serial link whenever the corresponding internal buffer begins
to fill.
[0129] 3. For each PHY with a non-empty buffer, the DUPLEX
continuously polls the PHY's TCA line and sends a cell over the
Utopia bus once the PHY's TCA status is asserted (i.e. when the PHY
has room in its internal transmit cell buffer).
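The three steps above, taken per PHY, can be sketched as below; the class name, depth constant, and threshold are illustrative (the four-cell depth and two-cell back-pressure mark follow the example figures given above):

```python
class PhyFifo:
    """Per-PHY downstream FIFO (four cells in one example) that raises
    back-pressure once it holds two or more cells (a sketch)."""
    DEPTH = 4
    BACKPRESSURE_AT = 2

    def __init__(self):
        self.cells = []

    @property
    def back_pressure(self):
        # Signaled upstream via embedded overhead whenever the buffer
        # begins to fill.
        return len(self.cells) >= self.BACKPRESSURE_AT

    def enqueue(self, cell):
        if len(self.cells) < self.DEPTH:
            self.cells.append(cell)

    def dequeue_if_tca(self, tca_asserted):
        # A buffered cell is sent only when the PHY's TCA is asserted.
        if tca_asserted and self.cells:
            return self.cells.pop(0)
        return None
```

Because each PHY has its own FIFO and its own back-pressure bit, a stalled PHY never blocks cells destined for its neighbors, which is the "no head of line blocking" property noted above.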
[0130] 5.2. Connecting Line and Core Cards (Stage 2
Multiplexing)
[0131] In a particular embodiment, inter-card communication between
the line cards and the core cards is carried on the serial links
(physically, 4-wire high-speed links in one embodiment). No
additional wires, clocks, or signals are required between cards.
Physically, the serial transceivers are designed to connect
directly to backplane traces or 4-wire (preferably shielded twisted
pair) cables up to 10 meters in length. No external drivers are
required. The high speed internal transmit clock of the serial link
is synthesized from a lower (1/8) speed reference clock. The serial
receiver recovers its clock from the incoming data, so both the Tx
and Rx clocking of the serial links can be fully asynchronous to
the clocking of the Utopia bus.
[0132] This section discusses the data path and flow control
aspects of the inter-card communications; later sections discuss
the inter-card communication channel, clock, timing, and other
signals that are carried by the 4-wire serial connection.
[0133] The serial links carry user cells "clear channel" by
appending extra bytes to each cell in order to carry all system
information. Idle (or stuff) cells are automatically injected when
no user data is present to ensure that continuous link error
monitoring is available. Loss of receive signal (LOS) and loss of
frame (LOF) interrupts are provided, as is far end notification of
LOS and LOF conditions. Error cell and user data cell counters are
also provided to assist in link performance monitoring.
[0134] In systems with redundant elements, each DUPLEX will be
connected to two VORTEX devices, one on the active core card and
one on the standby core card. Each VORTEX is capable of terminating
8 serial links from 8 DUPLEX devices. Each of the 8 links is
independently configurable as active or standby, so a core card can
simultaneously act as the active path for one line card but the
inactive path for another line card. This allows load sharing
between the two core cards and their associated WAN up-links. Load
sharing allows each of the two WAN up-links to, under normal
operating conditions, carry full traffic loads. Under failure
conditions the total up-link bandwidth would be reduced by 50%.
[0135] The serial links have been designed to support the removal
and insertion of cards while the equipment is powered up and
carrying traffic. This, for example, allows all traffic to be moved
to one core card while the second, standby core card is upgraded or
serviced. This "hot swap" capability is one feature of the
point-to-point interconnect architecture.
[0136] For systems with greater than 8 line cards, several VORTEX
devices may be placed on each core card. They share a 16 bit wide,
50 MHz ANY-PHY bus with the ATM layer devices such as the APEX and
the ATLAS described in the above cited references. The VORTEX
devices are bus slaves on this bus. To the bus master, each VORTEX
looks like a multi-PHY device supporting up to 264 logical PHYs (8
links times 32 PHYs per link, plus 8 control channels, one per
link).
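The 264-logical-PHY count decomposes as 8 links x 32 PHYs plus 8 control channels. The flat numbering below is purely illustrative; the device's actual address map is not specified in this text:

```python
def logical_phy_address(link, phy=None):
    """Flatten (link, phy) into a logical PHY number (illustrative
    ordering only). phy=None selects the link's control channel."""
    assert 0 <= link < 8
    if phy is None:
        return 8 * 32 + link  # the 8 control channels follow the 256 PHYs
    assert 0 <= phy < 32
    return link * 32 + phy

TOTAL_LOGICAL_PHYS = 8 * 32 + 8  # 264, as stated above
```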
[0137] 5.2.1. Upstream (to the Core Card) Traffic Flow Control on
the Core Card
[0138] In a typical access concentrator the WAN up-link will
operate at an OC-3 rate or below (i.e. <155 Mbps) while the
aggregate upstream burst bandwidth of the PHYs can be much higher.
In order to smooth out the upstream traffic bursts with no or
minimal loss of traffic there must be buffering and traffic
management somewhere in the access concentrator. This is typically
handled by a traffic management device such as an APEX.
[0139] Various Options
[0140] The issue of where upstream traffic buffers are placed in
the system architecture is an important one that deserves further
discussion. In its simplest form, the system designer of an access
multiplexer or switch is faced with three choices: put buffering on
the line card and only pull upstream traffic off each line card
when the WAN up-link can take it, or pull the upstream traffic off
the line card immediately and buffer it on the core card, such as
done by the VORTEX and DUPLEX, or put buffering on every card
(often done when there is a separate switch fabric between cards).
The first approach requires significant over-engineering of the
total amount of buffer space required system-wide and drives these
costs onto the line card. It cannot take advantage of the
statistical gain made by accumulating traffic bursts across a large
number of line cards. As was mentioned previously, in access
applications each line may be idle for significant portions of the
time. Further, access speeds vary widely depending on the services
being offered. Rate adaptive services such as ADSL vary their speed
based on the loop conditions of individual customers. Taken together,
this wide degree of per line variability lends itself to
significant statistical gain if the traffic buffering is
centralized. As well, under extreme load conditions--situations
where upstream traffic must be discarded intelligently--it is very
difficult to design the system so that traffic discard is handled
optimally and fairly across all line cards. This is especially true
when QoS (Quality of Service) issues need to be addressed.
[0141] The second approach eliminates the buffers on the line card
(except possibly for the very small internal buffers used by the
VORTEX and DUPLEX to keep the serial links operating efficiently
with back-to-back transfers), but there are costs associated with
this approach. The second approach requires that traffic be moved
to its buffering point before PHY buffer overflow occurs on the
line card. Hence the entire upstream data-path from PHY to first
significant buffering point (i.e. the ATM traffic management
device) must have sufficient capacity to ensure that it isn't a
bottleneck.
[0142] For a particular VORTEX and DUPLEX architecture, a good rule
of thumb is that the maximum aggregate upstream bandwidth of the
PHYs on a single line card should be less than 155 Mbps. This
eliminates the individual serial links as potential bottlenecks. On
the core card the ANY-PHY bus sits between the VORTEXs and the ATM
layer, so the 800 Mbps bus speed must also be taken into
consideration. Hence another rule of thumb is that the maximum
sustained upstream data rate, when taken in aggregate from all
active PHYs, should be less than 800 Mbps. Since the WAN up-link is
typically OC-3 or less in speed this is not a significant
restriction. Note also that systems implemented with balanced load
sharing between duplicated core cards can essentially double the
upstream buffering capacity and the aggregate upstream burst
tolerance of the system.
[0143] In very large access multiplexers such as DSLAMs one
normally assumes that not every PHY is active at all times.
However, in theory it may be physically possible for the aggregate
upstream burst bandwidth to exceed the 800 Mbps bus capacity unless
steps are taken to prevent this error condition. It is impractical
to sustain 800 Mbps bursting for even a short period of time due to
the massive amount of cell buffering that would be required to
buffer cells while waiting for a DS-3 or OC-3 WAN up-link to clear
the traffic. Hence the multiplexer's call setup software, also
known as the Connection Admission Control (CAC) software, will need
to prevent potential upstream traffic overflow by simply refusing
additional connections if the multiplexer becomes overloaded.
Having discussed why centralized buffering is the preferred
approach, this discussion will now proceed to a description of the
second stage of multiplexing for upstream traffic.
[0144] 5.2.2. Option Selection
[0145] As described, the DUPLEX multiplexes all upstream traffic
into a 200 Mbps serial stream and sends it simultaneously to the
active and inactive core cards (the term "active core card" is used
rather loosely here since load balancing designs do not have a
strictly active and strictly inactive card; however, for
simplicity, the discussion will continue to use the term "active
core card" rather than the longer but more accurate "core card
active for the line card being discussed") via the 4-wire serial links.
The VORTEX on the active core card terminates the serial link,
monitors/counts transmission errors, discards idle cells, extracts
the user cells, and places them on the core card's parallel ANY-PHY
bus as described.
[0146] The VORTEX schedules upstream traffic from its eight serial
links into the ANY-PHY bus. It implements a user programmable
weighted round robin polling scheme. The per-link weights can
usually be left to their default status, which is equal weight on
all links. However, a higher weight may be desirable on a specific
link if, for example, the link is being used to connect to a
co-processor card or perhaps a high speed PHY card that will be
generating a large amount of traffic relative to the other line
cards. Weights are relative and linear; a link with a weight two
times another will be polled twice as often.
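The relative, linear nature of the weights can be sketched as a polling-sequence expansion. This is a simplification: a real implementation would interleave the entries rather than poll a heavily weighted link back to back, and the weight values shown are examples only:

```python
def build_polling_sequence(weights):
    """Expand per-link weights into a polling sequence in which a link
    with twice another's weight appears, and so is polled, twice as
    often (a sketch of the weighted round robin described above)."""
    sequence = []
    for link, weight in enumerate(weights):
        sequence.extend([link] * weight)
    return sequence

# Default equal weights, except link 2 is given double weight, e.g. for
# a co-processor card or high-speed PHY card generating extra traffic:
polling = build_polling_sequence([1, 1, 2, 1, 1, 1, 1, 1])
```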
[0147] Each VORTEX has a raw upstream bandwidth capability of 1.6
Gbps (8 links at 200 Mbps each) while the core card ANY-PHY bus is
typically an 800 Mbps bus (16-bit wide, 50 MHz parallel bus).
Therefore the VORTEX implements back-pressure on the serial links
to prevent the upstream traffic from overflowing at the parallel
bus. Each serial link is provided a small internal receive cell
buffer. As the receive buffer approaches the full mark the VORTEX
sends a back-pressure indication to the DUPLEX via the embedded
system overhead channel in the downstream serial link. The DUPLEX,
after receiving the buffer full indication, immediately begins
sending idle cells until the back-pressure indication is deasserted
by the VORTEX. (In the upstream direction it is sufficient to use a
single back-pressure indication for all channels. Head of line
blocking is not an issue because all traffic is being directed to a
single port--the ATM layer device.)
[0148] In systems with duplicated core cards the state of an
upstream link is determined (logically) by the state of the active
bit on the corresponding downstream link. Upstream links
corresponding to inactive downstream links continue to monitor for
errors, but can be programmed to stay in the FIFO reset state so
that upstream traffic on the spare link is discarded by the VORTEX.
(On the spare core card, embedded control channel cells that are
passed through to the ATM layer should normally be left to pass
through, i.e. the FIFO should not be held in reset.) Optionally,
the inactive links can be programmed to function normally, thereby
leaving the spare core card's ATM layer responsible for processing
cells quickly enough to prevent buffer overflow in the VORTEX.
Buffer overflow on an inactive upstream link is possible because
the DUPLEX ignores flow control (back-pressure) information from
the inactive link. This must be taken into consideration when
determining how or if upstream user cells are handled on spare
links.
[0149] In some situations less than 8 of the VORTEX serial links
will be equipped. This occurs if line cards are not equipped, or if
a shelf does not have a multiple of 8 line card slots. In this
situation the unequipped serial links can be disabled through
software. The serial signal pins for disabled links are left in
tri-state.
[0150] In summary, in particular embodiments, the upstream data
path on the uplink card (second stage multiplexing) operates
according to the following:
[0151] 1. Upstream cells are sent by the DUPLEX as soon as they are
received from the PHYs. In protected systems the cells are sent
simultaneously on both the active and inactive links.
[0152] 2. The DUPLEX will temporarily suspend sending data from all
PHYs when the active link is asserting back-pressure. Back-pressure
from the inactive link is ignored.
[0153] 3. The serial link and core card bus bandwidths are
sufficient to ensure that, in a properly engineered system, any
back-pressure is temporary and will not result in buffer overflow
at the PHYs.
[0154] 4. The VORTEX services its eight serial links in a simple
weighted round robin fashion.
[0155] 5. As each cell is received by the VORTEX it is tagged with
a link ID (0 . . . 7) and a VORTEX ID (0 . . . 31) and made
available to the ANY-PHY bus as discussed below in Multiplexing on
the Core Card (Stage 3 Multiplexing).
[0156] 5.2.3. Downstream (to the Line Card) Traffic
[0157] In a typical system, the WAN port will operate at a much
higher rate than any of the individual PHYs on the line cards and
therefore, under burst conditions, downstream traffic should be
buffered until the respective PHY is able to receive it.
[0158] As with the upstream direction, the system designer of an
access multiplexer or switch is faced with three choices: (1) put
buffering on the line card and send cells to the line card as soon
as the cell is received from the WAN port; or (2) only send cells
to a PHY when the PHY can accept them, leaving cells buffered on
the core card until the PHY clears (this is the approach used in
the VORTEX and DUPLEX architecture); or (3) put buffering on all
port cards, separated by a switching fabric. As is the case in the
upstream direction, the first approach requires significant
over-engineering of the total amount of buffer space required
system-wide and drives these costs onto the line card. It cannot
take advantage of the statistical gain made by centralizing the
buffering of traffic bursts destined to a large number of line
cards. (As was mentioned previously, in access applications each
line may be idle for significant portions of the time. Further,
access speeds vary widely depending on the services being offered.
Rate adaptive services such as ADSL vary their speed based on the loop
conditions of individual customers. Taken together, this wide
degree of per line variability lends itself to significant
statistical gain if the traffic buffering is centralized.) To
prevent internal bottlenecks the downstream data-path to every line
card must be at least as fast as the WAN port if a simple
downstream broadcast is used. Otherwise the core card will need to
perform demultiplexing on the data stream and send each cell only to
its appropriate line card.
[0159] One potential negative of the approach implemented by the
VORTEX and DUPLEX is that the core card's traffic management device
must function across a greater number of PHYs than a traffic
management device on a single line card. However, a particular APEX
Traffic Management device has been designed to perform a full
featured traffic and buffer management function across 2048 PHYs
while interfacing directly to any number of VORTEX devices.
[0160] As described, the DUPLEX sends the VORTEX the logical
equivalent of each PHY's FIFO TCA status. (In a particular
embodiment, the DUPLEX indicates the status of an internal cell
deep buffer associated with each PHY. However, since the status of
this buffer (full or empty) depends ultimately on the corresponding
PHY's transmit FIFO it is accurate to characterize this
back-pressure as a "proxy TCA" indication.) The information is sent
on both the active and inactive links, but the remainder of this
section discusses only the behavior of active serial
links. In effect, each VORTEX appears to the traffic manager as a
256 port multi-PHY. This is depicted in FIG. 5.
[0161] For each of its 8 serial links the VORTEX provides an
internal cell buffer for each of the maximum 32 PHYs supported by
the downstream DUPLEX. This allows cell transfers to occur on
different serial links simultaneously. The ANY-PHY bus is four
times faster than the serial link, so this ensures the full 800
Mbps can be used. Each link schedules cells from its 32 cell buffer
(one per downstream PHY) on a strictly round robin basis, where
PHYs without downstream cells are, of course, skipped over.
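One scheduling pass of this per-link round robin can be sketched as below. The buffer representation is an illustrative assumption; the behavior shown is just the strict rotation over 32 per-PHY buffers with empty buffers skipped.

```python
# Sketch of one round of the per-link downstream scheduler: strict
# round robin over 32 per-PHY cell buffers, skipping PHYs that have
# no queued cell. Data structures are hypothetical.

def schedule_round(buffers, start=0):
    """One pass over 32 per-PHY buffers; returns (phy, cell) pairs
    for the cells sent this round, in visiting order."""
    sent = []
    for i in range(32):
        phy = (start + i) % 32
        if buffers[phy]:
            sent.append((phy, buffers[phy].pop(0)))
    return sent
```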
[0162] It should be noted that the maximum effective downstream
bandwidth for a single PHY is between 1/2 and 1/4 the bandwidth of
the serial link depending on the timing of the ATM layer device
acting as bus master. (The aggregate bandwidth of ail the PHYs on
multi-PHY line card can be about 90% of the serial link rate. The
1/2 to 1/4 link rate restriction being discussed here applies only
to each individual PHY.) This is due to buffer "high water mark"
levels and internal back-pressure signal timing constraints. VORTEX
registers can be configured in software to adjust the buffer fill
level for line cards with very fast PHYs. An exception to this is
when a single PHY is connected to the DUPLEX. In that case the full
bandwidth of the serial link (up to approximately 90%) is
available.
[0163] In summary, the downstream data path in the second stage of
multiplexing is:
[0164] 1. The VORTEX provides 256 "proxy TCA" signals that can be
used directly by the ATM traffic management device to safely
schedule cells directed to the PHYs on the line card.
[0165] 2. The serial link bandwidth, the core card bus bandwidth,
and the internal buffering in the VORTEX and DUPLEX devices are all
sufficient to ensure that downstream traffic destined to one PHY
does not block downstream traffic destined to another PHY
regardless of where the two PHYs are located (i.e. whether they are
on the same or different line cards).
[0166] 3. Each link's active indication (in the downstream
direction) can be set individually by software. Inactive links (on
the VORTEX) still present TCA information to the ATM layer device,
but any cells sent on these links will be discarded by the
DUPLEX.
[0167] 4. As the VORTEX receives a downstream cell from the ATM
layer the in-band prepended address is decoded and used to route
the cell to the appropriate link and the appropriate internal PHY
buffer.
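Item 4 above, taken together with the software-programmed PHY address space mapping described later in this section, can be sketched as an address-recognition rule: each VORTEX claims a programmed region of the flat downstream address space and latches only cells addressed within it. The base-register scheme and the 256+8 region size are illustrative assumptions.

```python
# Sketch of downstream in-band address recognition: each VORTEX is
# programmed with a region of the flat prepended PHY address space
# and latches only cells whose address falls in its region. The
# base/size scheme is hypothetical.

PORTS_PER_VORTEX = 256 + 8  # 256 PHYs plus 8 embedded control channels

def accepts(base, prepended_addr):
    """True if a VORTEX programmed with `base` should latch the cell."""
    return base <= prepended_addr < base + PORTS_PER_VORTEX

def route(bases, prepended_addr):
    """Index of the (single) VORTEX that accepts the cell, or None if
    no device claims the address (the cell is ignored by all)."""
    for i, base in enumerate(bases):
        if accepts(base, prepended_addr):
            return i
    return None
```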
[0168] 5.3. Multiplexing on the Core Card (Stage 3
Multiplexing)
[0169] As was discussed in the previous section, in many ways the
VORTEX acts as a proxy for all the PHYs attached to the
corresponding DUPLEX devices. In one implementation, because each
DUPLEX can interface to a maximum of 32 PHY devices, each VORTEX
DUPLEX can interface to a maximum of 32 PHY devices and each VORTEX
can link to eight DUPLEX devices, each VORTEX can represent up to
8.times.32=256 PHY devices. Further, there can be up to 31 VORTEX
devices directly addressed on a single bus, although electrical
limits on the bus may restrict the number of devices to fewer.
This, therefore, provides potentially a large number of PHYs or
ports on a channel uplinking to a WAN interface.
[0170] This section discusses how the traffic from this many PHY
devices is handled and addressed by the ATM layer in a particular
implementation. For reasons that are discussed below, the upstream
and downstream directions are handled differently. The remainder of
this section discusses how traffic on the active core card's data
path is handled.
[0171] 5.3.1. Upstream (to the Core Card) Traffic
[0172] A problem encountered in any multiplexer of lower speed
ports up to a high-speed channel is that the high-speed channel
(such as the ATM layer) needs to know which port each cell came
from. In ATM, there are two obvious choices: rely on a unique field
within the original cell to identify the source, or add a small,
unique PHY ID tag to each cell. Although it might be tempting to
use the existing VPI/VCI address field present in every ATM cell to
uniquely identify its port, this is not something that can be
guaranteed unless the VPI/VCI values are restricted--not generally
a desired solution.
[0173] In the present invention, what works better is to "tag" each
cell with a prepended physical port ID, and then send the cell to
the core card for processing. By adding a short tag to each cell
the data path becomes fully "protocol neutral" and makes no
assumptions about the contents or address fields within the user
cells.
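One plausible bit packing of such a prepended tag is sketched below, using 5 bits of VORTEX ID, 3 bits of link ID, and 5 bits of PHY ID. The field widths and ordering are assumptions for illustration only; the actual prepend format is device specific. The point is that the tag travels outside the cell payload, so the data path stays protocol neutral.

```python
# Hypothetical packing of the prepended port-ID tag: 5-bit VORTEX ID,
# 3-bit link ID, 5-bit PHY ID. Widths and ordering are assumptions.

def prepend_tag(vortex_id, link_id, phy_id):
    """Pack the three IDs into one tag word."""
    assert 0 <= vortex_id < 32 and 0 <= link_id < 8 and 0 <= phy_id < 32
    return (vortex_id << 8) | (link_id << 5) | phy_id

def parse_tag(tag):
    """Recover (vortex_id, link_id, phy_id) from a tag word."""
    return (tag >> 8) & 0x1F, (tag >> 5) & 0x7, tag & 0x1F
```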
[0174] As was described, in a particular implementation, each
DUPLEX and VORTEX multiplexes and tags its upstream traffic into a
single stream of cells, which are offered one by one to the ATM
layer by the VORTEX. Every upstream cell is tagged with prepend
bits that uniquely identify the VORTEX ID (0 . . . 31), link ID (0
. . . 7), and PHY ID (0 . . . 31) to which the cell belongs. (There
is also a control channel cell identification, as discussed
elsewhere herein.) Since the physical source (i.e. the port) of each
upstream cell is self-identified by its tag, the VORTEX can act as
if the stream of upstream cells is coming from a single PHY. Put
another way, in the upstream direction each VORTEX appears as a
single PHY slave device supplying a stream of possibly expanded
length cells to the ATM layer. (Optionally, the PHY ID tag can be
passed in the UDF/HEC word without expanding the cell length.)
This upstream tagging is illustrated in FIG. 5.
[0175] To accommodate more than one VORTEX on the bus, the ATM
device need only poll the Receive Cell Available (RCA) status line
of each of the VORTEX devices. Simple round robin polling of the
devices will normally suffice because back-pressure is used for
flow control on the individual serial links, and each VORTEX
performs a user programmable weighted round robin polling of its 8
serial links.
[0176] In summary, the upstream data path on the core card, in one
embodiment, operates according to the following method:
[0177] 1. Each VORTEX services its eight serial links in a simple
weighted round robin fashion.
[0178] 2. As each cell is received the VORTEX tags it with a link
ID (0 . . . 7) and a VORTEX ID (0 . . . 31) and makes it available
to the ANY-PHY bus. Since the DUPLEX will have added a PHY ID tag,
the accumulated tag uniquely identifies the cell's source port.
This is shown in FIG. 1.
[0179] 3. The ATM layer device, acting as bus master, polls the RCA
status of each of the VORTEX devices present on the bus. Each
VORTEX looks like a single PHY to the ATM layer device. When a
VORTEX indicates that it has a cell available it is read in by the
ATM layer device for processing.
[0180] 4. The ATM traffic management device is responsible for
buffering upstream cells until they can be forwarded to the WAN
link, sent back to a line card (for line to line switching), or
otherwise processed.
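Step 3 above, the bus master's round-robin RCA poll, can be sketched as follows. The device representation is an illustrative stand-in for the bus-level handshake, not an actual driver interface.

```python
# Sketch of the ATM layer's round-robin poll of Receive Cell Available
# (RCA) status across the VORTEX devices on the bus: when a device
# indicates a cell is available, the master reads it in. The dict-based
# device model is hypothetical.

def poll_round(devices):
    """One round-robin pass; returns the cells read this pass.
    Each device is a dict with an 'rca' flag and a 'fifo' list."""
    cells = []
    for dev in devices:
        if dev["rca"]:
            cells.append(dev["fifo"].pop(0))
            dev["rca"] = bool(dev["fifo"])  # RCA follows FIFO occupancy
    return cells
```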
[0181] 5.3.2. Downstream (to the Line Card) Traffic
[0182] To the ATM layer each of the VORTEX devices on the core
card's ANY-PHY bus appears, in the downstream direction, as if it
were a 256+8 port multi-PHY. (The extra 8 channels are the embedded
control channels, one per link.) Further, there can be numerous VORTEX
devices on the bus, limited mainly by electrical loading. In order
to schedule cell traffic into these numerous PHYs the ATM layer
device must efficiently perform three related functions: TCA status
polling, PHY selection, and cell transfer.
[0183] On each bus cycle the ATM layer device, which is bus master,
can poll the status of an individual PHY's proxy TCA status. It
does this by presenting a PHY address on the ANY-PHY bus address
lines. If the VORTEX acting as proxy for the polled PHY has room
for the cell in its internal buffer then it will raise its TCA line
two bus clock cycles after the polling address is presented. All
other VORTEX devices on the bus will tri-state their TCA lines. As discussed
below, PHY polling uses a different addressing mechanism than PHY
selection for cell transfer. This gives guaranteed and
deterministic access to polling bandwidth over the transmit address
lines.
[0184] None of the VORTEX devices on the bus will respond to a NULL
address (all ones) poll. The ATM layer device will insert a NULL
address between valid PHY addresses to give the previously polled
VORTEX time to tri-state the TCA line. If the TCA line is not
shared among slave devices then a PHY address can be presented on
every bus cycle.
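The polling address sequence described above can be sketched as follows. The NULL address value follows the all-ones convention stated in the text; the function shape itself is an illustrative assumption.

```python
# Sketch of the TCA polling address stream: on a shared TCA line, a
# NULL address (all ones, claimed by no device) is inserted between
# valid PHY addresses so the previously polled VORTEX has a cycle to
# tri-state the line. A 12-bit address width is assumed.

NULL_ADDR = 0xFFF  # all-ones NULL poll; no device responds

def poll_sequence(phy_addrs, shared_tca=True):
    """Return the per-cycle address stream for a list of PHY polls."""
    seq = []
    for addr in phy_addrs:
        seq.append(addr)
        if shared_tca:
            seq.append(NULL_ADDR)  # turnaround cycle for tri-state
    return seq
```

When the TCA line is not shared, the turnaround cycles can be dropped and a valid address presented on every bus cycle, doubling the polling rate.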
[0185] For PHY selection (to initiate cell transfer) in the
downstream direction in-band addressing is used. With this scheme
the ATM layer device prepends the selected PHY address to the
transmitted cell. The TSX (transmit start of transfer) bus signal
is asserted by the ATM layer device during the first cycle of a
data block transfer (coinciding with the PHY address) to mark the
start of a block transfer period.
[0186] All VORTEX devices on the bus receive all cells. It is up to
the appropriate VORTEX to recognize the PHY address and latch in
the remainder of the cell. The other VORTEX devices will simply
ignore the cell. The mapping of the PHY address space of each
VORTEX is programmed by software at device initialization time.
FIG. 7 shows the Downstream Cell Overhead.
[0187] In summary, the downstream data path on the core card, in one
embodiment, operates according to the following method:
[0188] 1. Traffic arrives from the WAN link and is buffered by the
ATM layer.
[0189] 2. The ATM layer prepends each cell with a 12 bit PHY
address that identifies the VORTEX, link, and PHY (or identifies
the VORTEX, link, and the embedded control channel). This
information is used by the VORTEX to determine whether a cell that
is placed on the ANY-PHY bus by the ATM layer should be read in or
ignored.
[0190] 3. For each PHY to which the VORTEX is connected (via its 8
DUPLEX devices) the VORTEX provides a proxy TCA signal that is polled by
the ATM layer via external address lines.
6. Example APEX Architecture Description
[0191] FIG. 7 shows a functional block diagram of an APEX ATM traffic
manager 200 according to a specific embodiment of the invention, as
described in greater detail in the above referenced documents. The
functional diagram is arranged such that cell traffic flows through
the APEX from left to right. The APEX is a full duplex ATM traffic
management device, providing cell switching, per VC queuing,
traffic shaping, congestion management, and hierarchical scheduling
to up to 2048 loop ports and up to 4 WAN ports.
[0192] The APEX provides per-VC queuing for 64K VCs. A per-VC queue
may be allocated to any Class of Service (COS), within any port, in
either direction (ingress or egress path). Per-VC queuing enables
PCR or SCR per-VC shaping on WAN ports and greater fairness of
bandwidth allocation between VCs within a COS.
[0193] The APEX provides three level hierarchical scheduling for
port, COS, and VC level scheduling. There are two three-level
schedulers: one for the loop ports 205 and one for the WAN ports
210. The APEX supports up to 256k cells of shared buffering in a
32-bit wide SDRAM, accessed through interface 220. Memory
protection is provided via an inband CRC-10 on a cell by cell
basis.
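Assuming the CRC-10 referred to above is the standard ATM CRC-10 (generator polynomial x^10 + x^9 + x^5 + x^4 + x + 1, as used for ATM OAM cells), the per-cell protection can be sketched as below. The key property shown is that appending the 10-bit CRC to the protected data makes the receiver's remainder zero, while a corrupted bit yields a nonzero remainder.

```python
# Bitwise sketch of CRC-10 cell protection, assuming the ATM OAM
# generator x^10 + x^9 + x^5 + x^4 + x + 1. The bit-list interface is
# illustrative; real hardware computes this in parallel logic.

POLY = 0x633  # generator polynomial, including the x^10 term

def crc10(bits):
    """CRC-10 of a message given as a list of 0/1 bits."""
    reg = 0
    for b in bits + [0] * 10:  # append 10 zero bits for the shift
        reg = (reg << 1) | b
        if reg & (1 << 10):
            reg ^= POLY
    return reg & 0x3FF

def check(bits_with_crc):
    """True if message-plus-CRC divides evenly (no error detected)."""
    reg = 0
    for b in bits_with_crc:
        reg = (reg << 1) | b
        if reg & (1 << 10):
            reg ^= POLY
    return reg & 0x3FF == 0
```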
[0194] AAL5 SAR assistance 240 is provided for AAL5 frame-traffic
to and from the uP. The APEX provides a 32-bit microprocessor bus
interface through 250 for signaling, control, cell and frame
message extraction and insertion, VC, Class, and port context
access, control and status monitoring, and configuration of the IC.
Microprocessor burst access for registers, cell and frame traffic
is supported. The APEX provides a 36-bit SSRAM interface 260 for
context storage supporting up to 4 MB of context for up to 64K VCs
and up to 256k cell buffer pointer storage. Context Memory
protection is provided via 2 bits of parity over each 34-bit
word.
7. Example VORTEX Architecture Description
[0195] FIG. 8 shows a functional block diagram of a VORTEX 300 that
may embody aspects of the invention. Main components of the VORTEX
include a set of per-serial-link control functions 305. For each
serial link there is a cell processor 315, a 33-cell per-PHY transmit
buffer 325, a five-cell FIFO 335, a one-cell incoming microprocessor
buffer 330, and a four-cell incoming microprocessor FIFO 345. The
microprocessor interface 310 is provided for device configuration,
control and monitoring by an external microprocessor. Normal mode
registers and test mode registers can be accessed through this
port. Test mode registers via 320 are used to enhance the
testability of the VORTEX. The interface has an 8-bit wide data
bus. Multiplexed address and data operation is supported.
[0196] To provide flexibility, two mechanisms are provided for the
transport of a control channel. Control channel cells can be
inserted and extracted either via the microprocessor interface or
via an external device transferring control channel cells across
the Any-PHY interface 320b. The control channel cell insertion and
extraction capabilities provide a simple unacknowledged cell relay
capability. For a fully robust control channel implementation, it
is assumed the local microprocessor and the remote entity are
running a reliable communications protocol.
[0197] The VORTEX contains a one cell buffer 330 per high-speed
link for the insertion of a cell by the microprocessor onto the
high-speed serial links. Optional CRC-32 calculation relieves the
microprocessor of this task.
8. Example DUPLEX Architecture Description
[0198] FIG. 9 shows the functional block diagram of the DUPLEX 400.
Although separated to improve clarity, many signals in the
following diagram share physical package pins. The use of the
Any-PHY interfaces and the clocked serial data interfaces is
mutually exclusive. In one embodiment, the DUPLEX is ATM specific.
It exchanges contiguous 53 byte cells with PHY devices. The PHY
interface can be either clocked serial data or Any-PHY Level 2.
[0199] With an Any-PHY interface, the DUPLEX coordinates cell
exchanges with up to 32 modems. In the upstream direction, the
modems are polled in a pure round robin manner and cells are queued
in two cell FIFOs 410 dedicated to each modem. In the downstream
direction, the cell buffer is logically partitioned into a four
cell FIFO for each modem to avoid head-of-line blocking. Those
modems associated with non-empty FIFOs are polled round robin. An
extended cell format provides four extra bytes for the encoding of
flow control, timing reference, PHY identification and link
maintenance information. A redundant (or spare) link is provided to
allow connection to two cell processing cards.
[0200] The DUPLEX Microprocessor 420 Interface is provided for
device configuration, control and monitoring by an external
microprocessor. Normal mode registers and test mode registers can
be accessed through this port. A cell insertion and extraction
capability provides a simple unacknowledged cell relay
capability.
[0201] In the upstream direction, control channel cells are
broadcast on both the active 422a and spare 422b high-speed serial
links. The contents of the cells shall distinguish the two control
channels if necessary. In the downstream direction, each high-speed
serial link has a dedicated queue for the control channel
cells.
[0202] The control channel is treated as a virtual PHY device. In
the upstream direction, it is scheduled with the same priority as
the other logical channels. In the downstream direction, control
channel cells are queued in a four cell FIFO for each high-speed
serial link. If either FIFO contains two or more cells, the cell
available bit, UPCA, returned upstream is deasserted to prevent
cell loss when the microprocessor cell reads fail to keep pace with
the incoming control channel cells.
[0203] The DUPLEX contains a one cell buffer 430 for the insertion
of a cell by the microprocessor into the high-speed serial
interface. All cells written by the microprocessor will have binary
111110 encoded in the PHYID[5:0] field within the cell prepend
bytes. This distinction between user cells and control cells
provides a clear channel for both types of cells.
[0204] By default, cells received on the high-speed serial link
will be routed to the Microprocessor Cell Buffer FIFO 440 if the
PHYID[5:0] prepend field is binary 111110. The control channel
cells can be programmed to be routed to the Any-PHY interface
instead. Buffer 440 has a capacity of four cells. The UPCA bit
returned on the upstream high-speed serial link will be set to
logic 0 when the buffer contains more than two cells. This shall
prevent overflow of the local buffer if the indication is responded
to within two cell slots.
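The UPCA flow-control rule just described can be sketched as a simple threshold function on the four-cell buffer, leaving two free slots to absorb cells already in flight when the indication is responded to within two cell slots. The constant names are illustrative.

```python
# Sketch of the UPCA rule on the four-cell microprocessor cell buffer:
# the cell-available bit returned upstream drops to logic 0 once the
# buffer holds more than two cells. Names are hypothetical.

FIFO_CAPACITY = 4
UPCA_THRESHOLD = 2  # deassert when depth exceeds this

def upca(fifo_depth):
    """UPCA is logic 1 (cells may be sent) at or below the threshold."""
    assert 0 <= fifo_depth <= FIFO_CAPACITY
    return 1 if fifo_depth <= UPCA_THRESHOLD else 0
```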
9. Other Embodiments
[0205] The invention has now been described with reference to
specific embodiments, including details of particular circuit
implementations incorporating aspects of the invention. Other
embodiments will be apparent to those of skill in the art. In
particular, methods according to the invention can be used in a
wide variety of communications applications different from those
illustrated herein.
[0206] It is understood that the examples and embodiments described
herein are for illustrative purposes and that various modifications
or changes in light thereof will be suggested by the teachings
herein to persons skilled in the art and are to be included within
the spirit and purview of this application and scope of the claims.
All publications, patents, and patent applications cited herein are
hereby incorporated by reference in their entirety for all
purposes.
* * * * *