U.S. patent application number 11/303561 was filed with the patent office on 2005-12-15 and published on 2006-07-13 for transfer of control data between network components.
Invention is credited to Noam Avni, Gershon Bar-On, Luke Chang, Benzi Ende, Sorana Lazarovichi, Simcha Pearl.
Application Number | 20060153238 11/303561 |
Document ID | / |
Family ID | 34678113 |
Filed Date | 2005-12-15 |
United States Patent
Application |
20060153238 |
Kind Code |
A1 |
Bar-On; Gershon ; et
al. |
July 13, 2006 |
Transfer of control data between network components
Abstract
A method and apparatus for transfer of power state data between
network components. An embodiment of a method includes determining
a command for a computer system, the computer system including a
first network component and a second network component. The first
network component and the second network component are linked by an
interface. The method further includes inserting a message
regarding the power state change in a data frame and transferring
the data frame from the first network component to the second
component via the interface, the data frame being transferred in a
period between data packets.
Inventors: |
Bar-On; Gershon; (DN Mizrah Binimamin, IL)
Ende; Benzi; (Maale Adumim, IL)
Pearl; Simcha; (DN Mazrah Binimamin, IL)
Lazarovichi; Sorana; (Jerusalem, IL)
Chang; Luke; (Aloha, OR)
Avni; Noam; (Jerusalem, IL) |
Correspondence
Address: |
BLAKELY SOKOLOFF TAYLOR & ZAFMAN
12400 WILSHIRE BOULEVARD
SEVENTH FLOOR
LOS ANGELES
CA
90025-1030
US
|
Family ID: |
34678113 |
Appl. No.: |
11/303561 |
Filed: |
December 15, 2005 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
10741314 | Dec 19, 2003 | |
11303561 | Dec 15, 2005 | |
Current U.S.
Class: |
370/473 |
Current CPC
Class: |
H04L 12/40 20130101;
H04L 12/40013 20130101; H04L 12/40136 20130101 |
Class at
Publication: |
370/473 |
International
Class: |
H04J 3/24 20060101
H04J003/24 |
Claims
1. A method comprising: determining a command for a computer
system, the computer system including a first network component and
a second network component, the first network component and the
second network component being linked by an interface; inserting a
control message regarding the command in a data frame; and
transferring the data frame from the first network component to the
second component via the interface, the data frame being
transferred in a period between data packets.
2. The method of claim 1, wherein the first and second network
components are Ethernet components.
3. The method of claim 2, wherein the first network component is a
MAC (media access control) Ethernet component.
4. The method of claim 3, wherein the second network component is a
PHY (physical) Ethernet component.
5. The method of claim 1, wherein the control message is sent in a
gap between data packets.
6. The method of claim 1, wherein the control message includes a
power state change.
7. The method of claim 6, further comprising changing the power
state of the second network device based at least in part on the
message regarding the power state change.
8. A network apparatus comprising: a first component; and a second
component, the second component to be coupled with the first
component via an interface, the second component to transfer a
plurality of data packets to the first component, the first
component to transfer a data frame in a gap between a first data
packet and a second data packet, the data frame including a control
message for the first component; wherein the first component
changes from a first state to a second state in response to the
data frame.
9. The network apparatus of claim 8, wherein the first component is
an Ethernet PHY (physical) component.
10. The network apparatus of claim 9, wherein the second component
is an Ethernet MAC (media access) component.
11. The network apparatus of claim 8, wherein the control message
is a power control message.
12. The network apparatus of claim 11, wherein the first component
changes from a first power state to a second power state in
response to the power control message.
13. The network apparatus of claim 11, wherein the interface does
not include a power control line.
14. The network apparatus of claim 11, wherein the first component
is coupled with a first port and wherein the data packets are to be
transferred to the first port.
15. The network apparatus of claim 10, wherein the first component
is coupled with a second port, wherein the first component has a
first power state for the first port and a second power state for
the second port.
16. A system comprising: a bus; a processor coupled to the bus; a
dynamic memory coupled to the bus to hold data for transmission; a
communication device coupled to the bus to transmit and receive
data, the communication device including: a physical network
device; and a media access network device, the media access device
to transfer a plurality of data packets from the processor to the
physical network device, the media access device to transmit a
control signal to the physical network device in a period between a
first data packet and a second data packet.
17. The system of claim 16, wherein the control signal includes a
power control signal.
18. The system of claim 17, wherein the physical network device
changes from a first power state to a second power state in
response to the power control signal.
19. The system of claim 18, wherein the physical network device
consumes less power in the second power state than in the first
power state.
20. The system of claim 18, wherein the physical network device is
coupled with a plurality of ports.
21. The system of claim 20, wherein the power control signal
includes a plurality of power control states, the power control
states including a first control power state for a first port and a
second power control state for a second port.
22. The system of claim 16, wherein the communication device is an
Ethernet device.
23. The system of claim 22, wherein the control signal is
transmitted in an interpacket gap between Ethernet data
packets.
24. The system of claim 16, wherein the control signal is a part of
a control data frame.
25. A machine-readable medium having stored thereon data
representing sequences of instructions that, when executed by a
machine, cause the machine to perform operations comprising:
sending a first data packet from a first network component to a
second network component; sending a data frame from the first
network component to the second network component after the end of
the first data frame, the data frame including a control field;
sending a second data packet from the first network component to
the second network component, the second data packet being sent
after the end of the data frame, the data frame being sent in a
time period between the first data packet and the second data
packet; and changing a state of the second network device based at
least in part on the control field.
26. The medium of claim 25, wherein the data frame is an in-band
frame sent in an inter-packet gap between the first data packet and
the second data packet.
27. The medium of claim 25, wherein the control field encodes the
current power state for the second network device.
28. The medium of claim 25, wherein the first network component and
the second network components are components of an Ethernet
communication device.
Description
RELATED APPLICATION
[0001] This application is a continuation-in-part of U.S. patent
application Ser. No. 10/741,314, filed Dec. 19, 2003.
FIELD
[0002] An embodiment of the invention relates to computer networks
in general, and more specifically to transfer of control data
between network components.
BACKGROUND
[0003] In a computer network, there are generally multiple layers
in operation. For example, Ethernet includes the PHY (physical) and
MAC (media access) layers. In many cases these represent separate
network devices or components, such as separate Ethernet PHY and
MAC devices, which communicate with each other via an interface
between the devices.
[0004] In the operation of network devices, there is an increasing
need to consider the power drain of the devices. In order to save
power, some conventional systems have certain lower power states to
allow a system to reduce power consumption when activity is at
reduced levels. Because of the layered nature of a network, there
may be a need to transfer power information between components in
order to control power usage.
[0005] However, the transfer of control and status information,
such as power information, may cause complications for certain
network component interfaces. If a limited interface between
network components is used to simplify the structure for the
network components, then the transfer of control information
between the components may be difficult. Adding a control
interface, such as a power state interface, may require sideband
communications and an increased pin count for the interface, which
thus results in more complex design of network devices. For
example, conventional stand-alone PHY devices often do not include
a power state interface because of the pin-count limitation of the
interface for the device. As a result, the devices may consume more
energy than would be necessary for a device for which power save
features have been implemented.
[0006] Semiconductor devices in a printed circuit board (PCB)
typically communicate through a device-to-device interconnection
(DDI). Such a DDI typically includes copper traces formed in the
PCB to transmit signals between devices. A device may be coupled to
a DDI by solder bonding or a device socket secured to the PCB.
[0007] Cisco Systems has promoted a Serial Gigabit Media
Independent Interface (SGMII) format for transmitting Ethernet data
frames between devices over a DDI according to a differential pair
signal format. In particular, SGMII specifies the transmission of
Ethernet data frames as 8B/10B code groups. Control information
may be transmitted in an out-of-band control channel coupled
between the devices.
[0008] IEEE Std. 802.3ae-2002, Clause 47 defines a 10 Gigabit
Attachment Unit Interface (XAUI) for transmitting data between
devices in data lanes. Each data lane typically transmits a serial
data signal between the devices using a differential signaling
pair. A XAUI is typically coupled to a 10 Gigabit Media Independent
Interface (XGMII) which is capable of transmitting or receiving
data at a data rate of ten gigabits per second. In addition, the
XAUI format may be used in transmitting data over an Infiniband 4x
cable as described in the proposed 10GBASE-CX4 standard presently
being explored by the IEEE P802.3ak working group. A
"device-to-device interconnection" (DDI) as referred to herein
relates to a data link to transmit data between devices. For
example, a DDI may be formed by conductive traces formed on a
circuit board between device sockets to receive devices. A DDI may
traverse multiple devices coupled between two devices over a
backplane and comprise conductive traces coupling the devices to
one another. In another example, a DDI may comprise a cable coupled
between two connectors at opposite ends of the cable. Each
connector may then transmit data between the cable and a device
coupled to the connector by conductive traces. However, these are
merely examples of a DDI and embodiments of the present invention
are not limited in these respects.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] The invention may be best understood by referring to the
following description and accompanying drawings that are used to
illustrate embodiments of the invention. In the drawings:
[0010] FIG. 1 is an illustration of network devices in an
embodiment of the invention;
[0011] FIG. 2 is an illustration of a data frame used to transport
control data in an embodiment of the invention;
[0012] FIG. 3 is a diagram of a possible in-band frame for control
data in an embodiment of the invention;
[0013] FIG. 4 is a flowchart to illustrate an embodiment of the
transfer of power state data;
[0014] FIG. 5 is a flowchart to illustrate an embodiment of
modification of power state in response to power command data;
[0015] FIG. 6 is an illustration of a computer system that may be
utilized in conjunction with an embodiment of the invention;
and
[0016] FIG. 7 shows a schematic diagram of devices coupled by a
device-to-device interconnection (DDI) according to an embodiment
of the invention.
DETAILED DESCRIPTION
[0017] A method and apparatus are described for transfer of control
data between network components.
[0018] In one embodiment of the invention, control data is passed
from a first network component to a second network component via a
limited interface. In one embodiment, the control data includes
power state data. In an embodiment, passing of control data is
accomplished without additional side-band signals and without
requiring software intervention.
[0019] In an embodiment of the invention, a network component
passes control data to another component by inserting command data
in an in-band frame between data packets. In an embodiment, a frame
is transferred that includes one or more control fields, which may
include, but are not limited to, a power control field. In an
embodiment, the receiving device receives the in-band frame,
interprets the power control field, and adjusts the current power
state based at least in part on the data contained in the
field.
[0020] For the purposes herein, control data includes any data
other than content data, and may include messages to notify a
recipient of events, status, requests, or configuration commands.
Control data includes, but is not limited to, power state data or
other power information.
[0021] Ethernet is generally a physical and data link layer
technology for local area networks (LANs). In the OSI (Open Systems
Interconnection) model of layers, Ethernet technology operates at
the physical (PHY) layer and the media access (MAC) portion or
sublayer of the data link layer. While structures may vary in
different systems, often the MAC and PHY functions are separate
devices.
[0022] In the operation of computer systems, the conservation of
power is often of prime importance. In network operations, a PHY
device connected to and driving signals on the physical cabling of
a network may consume significant power. However, the structure of
the connections between elements complicates efforts to conserve
power.
[0023] Computer components in general often have different power
states, with the power states representing different levels of
operation and power consumption. For example, power states are
defined in general in relevant specifications, including the PCI
(peripheral component interconnect) Bus Power Management
specification (for example, PCI Power Management 1.2) and the
Advanced Configuration and Power Interface Specification (for
example, ACPI Revision 3.0, Sep. 2, 2004). The PCI Bus Power
Management specification is intended to enhance the PCI
architecture by including standardized power-management
capabilities. This specification is architecturally aligned with
the ACPI specification, and enables PCI devices to participate in
platform-wide and operating system directed power management.
[0024] In this power management structure, four power states are
defined for devices, with the states being defined for each device
in terms of power consumption, device context (how much of the
context of the device is retained and does not need to be restored
by the operating system), the requirements for the device driver to
restore the device to full operation, and the length of time
required to restore the device to full operation. These states may
be designated as D0 (fully on), intermediate states D1 and D2 (with
reduced power consumption and less context retained), and D3
(device off). The power saving level thus is derived from the
current power state.
[0025] Because a PHY device (or other related device) is often one
of the largest power consumers in a system, power savings can be
achieved by controlling the power consumption of the PHY through
the passing of power control commands. However, if the PHY is not
integrated with the MAC, then passing power signals from MAC to the
PHY may be difficult. In particular, if the MAC and PHY utilize a
reduced MAC/PHY interface, there are fewer options for signal
transmission. For example, it may be necessary to utilize extra
sideband signals to pass power states because there is commonly no
power state interface. As an alternative, power states could be
programmed through, for example, a register that could hold the
current power state. However, this operation requires extra
software intervention in the system operation. A PHY device could
in some instances have a pin to shut down or disable the PHY, but
this does not provide smart power management. Each port may have
its own power state, but there is no per port power savings
available.
[0026] Under an embodiment of the invention, a control command is
passed to the PHY without software intervention. In an embodiment
of the invention, a control command is passed to the PHY without
additional side-band signals. In an embodiment of the invention,
the control command includes a power state. Embodiments of the
invention are not limited to power states, and may include any
control data that is transferred between components or elements. In
an embodiment of the invention, a smart power saving algorithm may
be implemented in the PHY through use of transferred control data.
Further, a PHY may include smart power saving on a per port
basis.
[0027] In one particular example, a reduced interface may exist
between Ethernet MAC and PHY, with the reduced interface being used
to minimize the routing between the MAC and the PHY. In an
embodiment of the invention, the interface may use in-band frames
to provide control information. For the purposes of this
description, an in-band frame is a frame that is transferred with
data packets or content information, in contrast with an
out-of-band message. In an embodiment of the invention, the in-band
data is transferred in the gap between data packets. The interface
will define the relevant frame fields for the device.
[0028] Ethernet devices are required to allow a minimum idle period
between transmission of frames known as the interframe gap (IFG) or
interpacket gap (IPG). The IPG provides a recovery time period
between frames, which allows a device time to prepare for reception
of the subsequent frame. Generally the minimum IPG for Ethernet is
96 bit times, which is 9.6 microseconds for 10 Mb/s Ethernet, 960
nanoseconds for 100 Mb/s Ethernet, and 96 nanoseconds for 1 Gb/s
Ethernet. Under an embodiment of the invention, the control data is
transported in a frame between the data frames, an in-band frame
transferred during the IFG. In one embodiment of the invention,
power states are transported in such in-band frames.
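The gap durations quoted above follow from dividing 96 bit times by the link rate. A minimal sketch of that arithmetic is given here; the function name `ipg_duration_ns` is an illustrative assumption and does not appear in the specification.

```python
from fractions import Fraction

MIN_IPG_BIT_TIMES = 96  # minimum interpacket gap, expressed in bit times


def ipg_duration_ns(link_rate_bps: int, bit_times: int = MIN_IPG_BIT_TIMES) -> Fraction:
    """Return the duration of the gap in nanoseconds for a given link rate."""
    return Fraction(bit_times * 1_000_000_000, link_rate_bps)


for label, rate in [("10 Mb/s", 10_000_000),
                    ("100 Mb/s", 100_000_000),
                    ("1 Gb/s", 1_000_000_000)]:
    print(f"{label}: {float(ipg_duration_ns(rate)):g} ns")
# prints:
# 10 Mb/s: 9600 ns
# 100 Mb/s: 960 ns
# 1 Gb/s: 96 ns
```

Any in-band frame sent during the gap has to fit, together with whatever idle time the link still requires, inside this window.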
[0029] In an embodiment of the invention, sending control states
between network devices may allow for simplified board design for
network devices, while enabling smart power saving algorithms for a
network device. For example, a PHY device may be simplified in
design because of the limited interface. However, the power
consumption of the PHY device may be controlled through the
transport of power states between devices.
[0030] FIG. 1 is an illustration of network devices in an
embodiment of the invention. In this illustration, a first network
device is a PHY device 105 and a second network device is a MAC
device 110. The PHY device 105 and the MAC device 110 are coupled
by an interface 115, which may be linked through a connector. The
PHY device 105 is coupled to certain ports 120, which may be used for
transmitting and receiving data. For example, a series of inbound
frames of data 130 may be received at a port and transferred by the
PHY device 105 to the MAC device 110. After the MAC device 110
processes and validates a frame of data, the frame is sent on to
network devices 125. The data flow may occur in both directions,
and an outbound flow of data 135 is also illustrated.
[0031] The interface 115 may vary in different systems. In one
example, the interface is a reduced interface that minimizes the
number of interconnections between the PHY device 105 and the MAC
device 110. However, the use of a reduced interface does not
provide for paths for sideband communications that may be used to
transfer power states. In an embodiment of the invention, the
interface uses an in-band frame 140 sent in a gap between data
packets to carry control data. In an embodiment of the invention,
an in-band control frame includes a power state field that is used
to control power consumption. In an embodiment, the frame includes
a power state field that may instruct the PHY to reduce power
consumption by moving the PHY into a lower power state. When the
power state changes, an in-band status and control frame with the
needed power state change is provided. While a single frame is
shown, any number of in-band frames may be transferred; the MAC
may send an in-band frame between each pair of data packets, and
may send more than one in-band frame between two data packets. In
an embodiment of the invention, the
PHY device may also utilize in-band frames to communicate other
control messages to the MAC in the inbound data stream 130. Among
other types of messages, the PHY may send a confirmation message in
response to a command, or an error message if a command appears to
contain an error.
[0032] FIG. 2 is an illustration of a data frame used to transport
control data in an embodiment of the invention. In this
illustration, a first packet of data 205 is followed by a second
packet of data 210. Between the two frames is the IPG 215, which is
the expected gap between two packets of data. In an embodiment of
the invention, between the two frames is an in-band frame 220, a
frame that begins after the end of the first packet 205 and ends
before the beginning of the second packet 210. In an embodiment,
the in-band frame 220 includes a power control field, the power
control field providing the current power state for the network. In
one example, the first packet 205 and the second packet 210 are
data packets being transferred from an Ethernet MAC device to an
Ethernet PHY device. In this example, the PHY device will read the
in-band frame 220 and determine, for example, whether a change in
power state has occurred or other control change has been made.
Based at least in part on the power state information contained in
the in-band frame 220, the PHY may reduce operations and transition
to a lower power state to conserve power, or may power back up to a
higher state to enable more functionality.
[0033] FIG. 3 is a diagram of a possible in-band frame for control
in an embodiment of the invention. The in-band frame provides one
example of a frame that may be used, but embodiments of the
invention may utilize any structure or order of fields in the
frame. In this example, the frame is a particular length, in this
particular case 39 bits long. The frame may provide for multiple
control and status states or commands. In this example, a first bit
305 represents a type, and a second bit a "done" field 310. There
are then 14 bits reserved for further use 315 and a bit for reset.
Following the reset bit are two bits, which may be used to encode
one of four different power states, the possible power states being
the D0, D1, D2, and D3 states for the particular device. In this
example, there are also four bits for control of LED's for display
330, an additional eight reserved bits 335, and a CRC (cyclic
redundancy check) word for error detection for the frame.
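The field layout described for FIG. 3 can be sketched as a pack and unpack routine. This is only an illustration under stated assumptions: the named fields total 31 bits, so the CRC word is taken to be 8 bits to reach the stated 39; the CRC polynomial (0x07), the most-significant-bit-first field order, the numeric values assigned to D0 through D3, and every identifier in the code are hypothetical and are not given in the specification.

```python
from enum import IntEnum
from typing import NamedTuple


class PowerState(IntEnum):
    """Two-bit encoding of the device power states (numeric values assumed)."""
    D0 = 0
    D1 = 1
    D2 = 2
    D3 = 3


# Field widths in the order described for the FIG. 3 example; names are illustrative.
FIELDS = [
    ("frame_type", 1),
    ("done", 1),
    ("reserved_a", 14),
    ("reset", 1),
    ("power_state", 2),
    ("led", 4),
    ("reserved_b", 8),
]
PAYLOAD_BITS = sum(width for _, width in FIELDS)  # 31 bits of named fields
CRC_BITS = 8                                      # assumed: 39 - 31
FRAME_BITS = PAYLOAD_BITS + CRC_BITS              # 39 bits in total


class ControlFrame(NamedTuple):
    frame_type: int = 0
    done: int = 0
    reset: int = 0
    power_state: PowerState = PowerState.D0
    led: int = 0


def crc8(value: int, nbits: int, poly: int = 0x07) -> int:
    """Bitwise CRC-8 over the top-down bits of value (polynomial is an assumption)."""
    crc = 0
    for i in reversed(range(nbits)):
        feedback = ((crc >> 7) & 1) ^ ((value >> i) & 1)
        crc = (crc << 1) & 0xFF
        if feedback:
            crc ^= poly
    return crc


def encode(frame: ControlFrame) -> int:
    """Pack the fields into a 39-bit integer, CRC in the low 8 bits."""
    values = frame._asdict()
    payload = 0
    for name, width in FIELDS:
        value = int(values.get(name, 0))  # reserved fields are sent as zero
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        payload = (payload << width) | value
    return (payload << CRC_BITS) | crc8(payload, PAYLOAD_BITS)


def decode(bits: int) -> ControlFrame:
    """Check the CRC and unpack the fields; raises ValueError on a bad frame."""
    if not 0 <= bits < (1 << FRAME_BITS):
        raise ValueError("frame is not 39 bits")
    payload, received_crc = bits >> CRC_BITS, bits & ((1 << CRC_BITS) - 1)
    if crc8(payload, PAYLOAD_BITS) != received_crc:
        raise ValueError("CRC mismatch")
    values = {}
    remaining = PAYLOAD_BITS
    for name, width in FIELDS:
        remaining -= width
        values[name] = (payload >> remaining) & ((1 << width) - 1)
    return ControlFrame(
        frame_type=values["frame_type"],
        done=values["done"],
        reset=values["reset"],
        power_state=PowerState(values["power_state"]),
        led=values["led"],
    )
```

For example, `decode(encode(ControlFrame(power_state=PowerState.D3)))` returns the same field values, while a frame with a flipped bit is rejected with a `ValueError`.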
[0034] FIG. 4 is a flowchart to illustrate an embodiment of the
transfer of power state data. In this embodiment, the transfer of
data frames from a first network device (such as a MAC device) to a
second network device (such as a PHY device) is illustrated for
simplicity, but there will likely also be traffic from the second
network device to the first network device, in addition to other
complexities that are not illustrated here. The flowchart is
limited to an illustration of the transfer of power state data, but
embodiments are not limited to this example, and other control
information may be handled in a similar manner.
[0035] In one embodiment, the first device receives data packets
periodically for transmission 405. The MAC device sends the data
packet to the PHY device 410, as in normal operation. If there are no
control messages to be sent to the PHY device, the MAC may not send
an in-band frame and may simply wait the needed IPG time between
frames 420 before transferring the next frame 410. However, if there
is a power state change or another control or state signal is
needed 415, then an in-band frame, including a power state frame,
is inserted in the data stream between data packets 425 before
returning to sending the next data packet 410.
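The transmit-side flow of FIG. 4 can be sketched as a small generator. This is a simplified model under assumptions, not the described hardware: the function `mac_transmit`, the event labels, and the use of a queue for pending control messages are all hypothetical, and the comments map each branch to the reference numerals in the paragraph above.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Optional, Tuple

Event = Tuple[str, Optional[object]]


def mac_transmit(packets: Iterable[object],
                 pending_control: Deque[object]) -> Iterator[Event]:
    """Yield what the MAC puts on the interface, in order."""
    for packet in packets:
        yield ("data", packet)                            # send data packet (410)
        if pending_control:                               # control message needed? (415)
            yield ("in_band", pending_control.popleft())  # insert in-band frame (425)
        else:
            yield ("ipg", None)                           # wait the IPG time (420)


pending = deque([{"power_state": "D2"}])
for event in mac_transmit(["packet-0", "packet-1"], pending):
    print(event)
```

Because the queue is checked after every data packet, a control message queued while a packet is being sent goes out in the very next gap, without any sideband signal.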
[0036] FIG. 5 is a flowchart to illustrate an embodiment of
modification of power state in response to power command data. In
this illustration, a PHY device will receive a data packet from a
MAC device 505. The PHY will process and deliver the data packet to
the appropriate port for transmission 510. After the data packet,
an in-band frame may be received 515. If there is no in-band frame,
the PHY device will wait the IPG time period before the possible
arrival of another data packet 505. If an in-band frame is
received, the PHY will interpret the in-band frame 525. If there is
a power control state change command 530, the PHY device will
change its power state in response to the command 535. If there are
any other commands 540, these commands may also be implemented 545.
While this diagram for simplicity illustrates the PHY device
complying with a power state change and other commands during the
time period between data packets, the timing of the operations may
vary in different embodiments. The PHY may implement certain
commands after the arrival of the next data packet, or otherwise
vary the timing of the implementation of commands as appropriate in
the context of the operation.
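The receive-side flow of FIG. 5 can be sketched in the same spirit. The class `PhyModel`, the event labels, and the representation of an in-band frame as a dictionary of commands are assumptions made for illustration; this sketch applies commands immediately, although, as noted above, the timing may vary in different embodiments.

```python
from typing import Dict, List


class PhyModel:
    """Toy model of a PHY reacting to data packets and in-band frames."""

    def __init__(self) -> None:
        self.power_state = "D0"
        self.delivered: List[object] = []
        self.other_commands: Dict[str, object] = {}

    def receive(self, kind: str, payload: object = None) -> None:
        if kind == "data":
            self.delivered.append(payload)        # receive and deliver to port (505, 510)
        elif kind == "in_band":                   # in-band frame received (515, 525)
            commands = dict(payload)
            new_state = commands.pop("power_state", None)
            if new_state is not None:             # power state change command? (530)
                self.power_state = new_state      # change power state (535)
            self.other_commands.update(commands)  # implement other commands (540, 545)
        elif kind != "ipg":                       # plain gap: nothing to do
            raise ValueError(f"unknown event kind: {kind!r}")


phy = PhyModel()
phy.receive("data", "packet-0")
phy.receive("in_band", {"power_state": "D2", "led": 0b0101})
phy.receive("ipg")
print(phy.power_state, phy.delivered, phy.other_commands)
```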
[0037] FIG. 6 is an illustration of a computer system that may be
utilized in conjunction with an embodiment of the invention. Under
an embodiment of the invention, a computer 600 comprises a bus 605
or other communication means for communicating information, and a
processing means such as two or more processors 610 (shown as a
first processor 615 and a second processor 620) coupled with the
bus 605 for processing information. The processors 610 may comprise
one or more physical processors and one or more logical processors.
Further, each of the processors 610 may include multiple processor
cores.
[0038] The computer 600 further comprises a random access memory
(RAM) or other dynamic storage device as a main memory 625 for
storing information and instructions to be executed by the
processors 610. Main memory 625 also may be used for storing
temporary variables or other intermediate information during
execution of instructions by the processors 610. The computer 600
also may comprise a read only memory (ROM) 630 and/or other static
storage device for storing static information and instructions for
the processors 610.
[0039] A data storage device 635 may also be coupled to the bus 605
of the computer 600 for storing information and instructions. The
data storage device 635 may include a magnetic disk or optical disc
and its corresponding drive, flash memory or other nonvolatile
memory, or other memory device. Such elements may be combined
together or may be separate components, and utilize parts of other
elements of the computer 600.
[0040] The computer 600 may also be coupled via the bus 605 to a
display device 640, such as a cathode ray tube (CRT) display, a
liquid crystal display (LCD), a plasma display, or any other
display technology, for displaying information to an end user. In
some environments, the display device may be a touch-screen that is
also utilized as at least a part of an input device. In some
environments, display device 640 may be or may include an audio
device, such as a speaker for providing audio information. An input
device 645 may be coupled to the bus 605 for communicating
information and/or command selections to the processors 610. In
various implementations, input device 645 may be a keyboard, a
keypad, a touch-screen and stylus, a voice-activated system, or
other input device, or combinations of such devices. Another type
of user input device that may be included is a cursor control
device 650, such as a mouse, a trackball, or cursor direction keys
for communicating direction information and command selections to
the one or more processors 610 and for controlling cursor movement
on the display device 640.
[0041] A communication device 655 may also be coupled to the bus
605. Depending upon the particular implementation, the
communication device 655 may include a transceiver, a wireless
modem, a network interface card, or other interface device. In one
embodiment, the communication device 655 may include a firewall to
protect the computer 600 from improper access. The computer 600 may
be linked to a network or to other devices using the communication
device 655, which may include links to the Internet, a local area
network, or another environment. In an embodiment of the invention,
the communication device 655 may comprise an Ethernet or similar
network device. In one embodiment, the communication device 655 may
comprise multiple components, such as an Ethernet PHY device and an
Ethernet MAC device. In an embodiment, the PHY device and the MAC
device are coupled together and transfer control data, including
power state information, between them. In an embodiment, the power
states and other command data are transferred via in-band frames
from the MAC device to the PHY device, which are placed between
data frames during the IPG period.
[0042] The computer 600 may also comprise a power device or system
660, which may comprise a power supply, a battery, a solar cell, a
fuel cell, or other system or device for providing or generating
power. The power provided by the power device or system 660 may be
distributed as required to elements of the computer 600.
[0043] A "serial data signal" as referred to herein relates to a
signal comprising information encoded into a series of symbols. For
example, a serial data signal may comprise a series of symbols
transmitted in a transmission medium where each symbol is
transmitted in a symbol period. However, this is merely an example
of a serial data signal and embodiments of the present invention
are not limited in these respects.
[0044] A "differential pair signal" as referred to herein relates
to a pair of synchronized signals to transmit encoded data to a
destination. For example, differential pair signal may transmit a
serial data signal comprising symbols to be decoded for data
recovery at a destination. Such a differential pair signal may
transmit each symbol as a voltage on each of two transmission
media. However, these are merely examples of a differential pair
signal and embodiments of the present invention are not limited in
these respects.
[0045] An "8 B/10 B encoding scheme" as referred to herein relates
to a process by which eight-bit data bytes may be encoded into
ten-bit "code groups" (e.g., 8 B/10 B code groups), or a process by
which ten-bit code groups may be decoded to eight-bit data bytes
according to a predetermined "8 B/10 B code group mapping." An "8
B/10 B encoder" as referred to herein relates to logic to encode an
eight-bit data byte to a ten-bit code group, and an "8 B/10 B
decoder" as referred to herein relates to logic to decode an
eight-bit byte from a ten-bit code group. An "8 B/10 B codec" as
referred to herein relates to a combination of an 8 B/10 B encoder
and an 8 B/10 B decoder.
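By way of illustration only, the following sketch shows the naming convention behind labels such as K28.5 and Dx.y that appear later in this description. In the 8 B/10 B code, x is the value of the low five bits of the eight-bit byte and y is the value of the high three bits, with a "D" prefix for data code groups and a "K" prefix for special code groups. The function names are hypothetical and are not part of the described embodiments, and the sketch does not implement the mapping of bytes to ten-bit code groups.

```python
# Illustrative helpers for 8B/10B code-group names (Dx.y / Kx.y).
# These only convert between an eight-bit value and its conventional
# name; they do not produce the ten-bit code groups themselves.


def code_group_name(byte: int, special: bool = False) -> str:
    """Return the conventional name of an eight-bit value, e.g. 'K28.5'."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    x = byte & 0x1F          # low five bits
    y = (byte >> 5) & 0x07   # high three bits
    prefix = "K" if special else "D"
    return f"{prefix}{x}.{y}"


def code_group_byte(name: str) -> int:
    """Return the eight-bit value for a name such as 'K28.1' or 'D16.2'."""
    x_text, y_text = name[1:].split(".")
    x, y = int(x_text), int(y_text)
    if not (0 <= x <= 31 and 0 <= y <= 7):
        raise ValueError("x must be 0..31 and y must be 0..7")
    return (y << 5) | x
```

For example, the eight-bit value 0xBC with the special flag set corresponds to K28.5, and K28.1 corresponds to 0x3C.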
[0046] "Transmission medium" as referred to herein relates to a
medium capable of transmitting data from a source to a destination.
For example, a transmission medium may comprise cabling (e.g.,
coaxial, unshielded twisted wire pair or fiber optic cabling),
printed circuit board traces or a wireless transmission medium.
However, these are merely examples of a transmission medium and
embodiments of the present invention are not limited in these
respects.
[0047] An "Ethernet data frame" as referred to herein relates to a
format for transmitting data in a data link according to a protocol
provided in versions of IEEE Std. 802.3 (e.g., to transmit data
frames according to 10BASE-X, 100BASE-X, 1000BASE-X or 10GBASE-X
protocols). An Ethernet data frame may include, for example, a
header portion including a media access control (MAC) address and a
payload portion including content data to be processed at a
destination. However, this is merely an example of an Ethernet data
frame and embodiments of the present invention are not limited in
these respects.
[0048] An Ethernet data frame may be used to transmit content data
between devices or nodes in a data channel. A "control message" as
referred to herein relates to messages that may be transmitted
between devices or nodes other than content data to notify a node
or device receiving the control message of events, status, requests
or configuration commands. However, these are merely examples of a
control message and embodiments of the present invention are not
limited in these respects. A control message may be transmitted in
a communication channel which is distinct from a data channel as an
"out-of-band" message. Alternatively, a control message may be
inserted or interleaved among content data transmitted in a data
channel as an "in-band" message.
[0049] Briefly, an embodiment of the present invention relates to
the transmission of 8 B/10 B code groups including Ethernet data
frames in a DDI. Control messages may be inserted among the 8 B/10
B code groups for transmission to a destination device. However,
this is merely an example embodiment and other embodiments are not
limited in these respects.
[0050] FIG. 7 shows a schematic diagram of a system 10 for
transmitting data to and receiving data from a node 34 through a
transmission medium 32. The transmission medium 32 may comprise any
one of several mediums suitable for transmitting data in a data
link such as, for example, a cable (e.g., coaxial, unshielded
twisted wire pair or fiber optic) or a wireless transmission
medium. The transmission medium 32 may transmit data between the
node 34 and a data transceiver 12 in Ethernet data frames according
to versions of IEEE Std. 802.3 (e.g., 10BASE-X, 100BASE-X,
1000BASE-X or 10GBASE-X).
[0051] The data transceiver 12 may be coupled to a controller 18 by
a DDI. The DDI may transmit a first differential pair signal 14
from the data transceiver 12 to the controller 18 and transmit a
second differential pair signal 16 from the controller 18 to the
data transceiver 12. According to an embodiment, each of the first
and second differential pair signals 14 and 16 may be transmitted
in a single pair of conductive traces (e.g., formed in a printed
circuit board, not shown) in the DDI coupled between the data
transceiver 12 and the controller 18. Accordingly, components
containing the data transceiver 12 and the controller 18 may be
coupled to one another by four device pins (not shown) on each
component (where each component comprises two device pins to
transmit or receive differential pair signal 14 and two device
pins to transmit or receive differential pair signal 16). The
device pins may be coupled to the DDI by solder bonding or device
sockets which are mounted to the DDI and adapted to receive the
components containing the data transceiver 12 and controller 18.
However, these are merely examples of how device pins may be
coupled to a DDI and embodiments of the present invention are not
limited in these respects.
[0052] The data transceiver 12 may comprise a physical media
dependent (PMD) section (not shown) for transmitting data to and
receiving data from the transmission medium 32 according to a
physical layer data transmission protocol such as Gigabit Ethernet
over unshielded twisted wire pair cabling (or 1000BASE-T) or 10
Gigabit Ethernet over unshielded twisted wire pair cabling (or
10GBASE-T). For example, the PMD section may comprise circuitry to
detect individual bits in Ethernet data frames received from the
transmission medium 32 (e.g., clock and data recovery circuitry)
and circuitry to transmit individual bits in Ethernet data frames
transmitted to the node 34. The data transceiver 12 may also
comprise circuitry (not shown) to encode eight bit bytes making up
Ethernet data frames received from the transmission medium 32 (via
the PMD section) into ten bit code groups for transmission to the
controller 18 on differential pair signal 14 as a serial data
signal. The data transceiver 12 may encode the eight bit bytes into
ten bit code groups (e.g., 8 B/10 B code groups) as described in
IEEE 802.3-2002, Clause 36. Similarly, the data transceiver 12 may
comprise circuitry to decode 8 B/10 B code groups received from the
differential pair signal 16 into eight bit bytes for transmission
in the transmission medium 32 via the PMD section.
[0053] The controller 18 may comprise a deserializer 20 to recover
8 B/10 B code groups from the differential pair signal 14 and a
serializer 22 to transmit 8 B/10 B code groups to the data
transceiver 12 as a serial data signal over the differential pair
signal 16. A physical coding sublayer (PCS) section 24 may decode
the 8 B/10 B code groups recovered from the deserializer 20 to
reconstruct eight-bit bytes of Ethernet data frames received at the
data transceiver 12 from node 34. Similarly, the PCS section 24 may
encode eight-bit bytes of Ethernet data frames into 8 B/10 B code
groups for the serializer 22 to transmit to the data transceiver 12
in differential pair signal 16 (for transmission to node 34).
[0054] The PCS section 24 may be coupled to a media access control
(MAC) receive block 26 to provide Ethernet data frames reassembled
from eight-bit bytes decoded from 8 B/10 B code groups. The PCS
section 24 may also be coupled to a MAC transmit block 28 to receive
Ethernet data frames for transmission through the transmission
medium 32. The MAC receive block 26 and MAC transmit block 28 may
be coupled at a signaling interface providing a Gigabit Media
Independent Interface (GMII) as defined in IEEE Std. 802.3-2000,
Clause 36. However, this is merely an example of how portions of a
MAC device may be coupled to a PCS section and embodiments of the
present invention are not limited in this respect.
[0055] The differential pair signals 14 and 16 may transmit
Ethernet data frames as 8 B/10 B code groups between the data
transceiver 12 and controller 18 as provided in IEEE Std.
802.3-2000, Clause 36.2.4. Such code groups used for the
transmission of Ethernet data frames may include, for example,
ordered code group sets for establishing bit and code group
synchronization, data code groups, idle code group (/I/), start of
packet delimiter code group (/S/), end of packet delimiter code
group (/T/), carrier extend code group (/R/) and error propagation
code group (/V/). In addition to transmitting 8 B/10 B code groups
in the differential pair signals 14 and 16 for the transmission of
Ethernet data frames, the controller 18 and data transceiver 12 may
transmit in-band control messages in the differential pair signals
14 and 16 along with encoded portions of Ethernet data frames. Such
in-band control messages may be transmitted as 8 B/10 B code groups
inserted among 8 B/10 B code groups transmitting encoded eight-bit
bytes of Ethernet data frames. By transmitting the control messages
in-band as 8 B/10 B code groups inserted among 8 B/10 B code groups
transmitted over the differential pair signals 14 and 16, control
messages which would otherwise be transmitted in a management data
input/output (MDIO) interface (either at the data transceiver 12 or
controller 18) may be transmitted as the inserted 8 B/10 B code
groups.
[0056] According to an embodiment, control messages may be inserted
among the 8 B/10 B code groups in differential pair signals 14 and
16 following an end of packet delimiter /T/ as a six byte (or code
group) sequence. When transmitting a control message, for example,
the code group sequence:
[0057] /T/R/K28.5/Dx.y/(six byte control message)/K28.5/Dx.y/
may be substituted for the typical code group sequence following an
Ethernet data frame:
[0058]
/T/R/K28.5/Dx.y/K28.5/Dx.y/K28.5/Dx.y/K28.5/Dx.y/K28.5/Dx.y/.
[0059] In this example, a six byte idle code group sequence
"/K28.5/Dx.y/K28.5/Dx.y/K28.5/Dx.y/" in the typical code group
sequence may be substituted with six bytes forming the control
message. A first byte of the six byte control message may include a
special symbol to indicate the presence of a control message (e.g.,
to access MDIO registers at the destination device) such as
"/K28.1/" including a comma. A second byte may specify read or
write access to specific MDIO registers. Third and fourth bytes may
specify information to be written to an MDIO register and a fifth
byte may be reserved. Finally, a sixth byte may include a cyclic
redundancy code for error correction (excluding the special symbol
/K28.1/). Similar six byte packets may be formatted for read access
acknowledge/response control messages or write access acknowledge
control messages. However, this is merely an example of how a
control message may be inserted among 8 B/10 B code groups for
transmitting Ethernet data frames and embodiments of the present
invention are not limited in these respects.
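By way of illustration only, the six byte layout described above may be sketched as follows. The description does not specify the bit layout of the second byte or the polynomial of the cyclic redundancy code, so the sketch assumes a hypothetical layout (bit 7 set for write, bits 6..0 holding a register number) and a CRC-8 with polynomial 0x07 computed over bytes two through five. These choices, and all function names, are assumptions made for the example only.

```python
# Sketch of the six-byte in-band control message described above.
# ASSUMPTIONS (not from the description): the access-byte bit layout
# and the CRC-8 polynomial 0x07 are hypothetical choices.

K28_1 = 0x3C  # eight-bit value named K28.1 (x = 28, y = 1)


def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8, MSB first, initial value 0 (assumed parameters)."""
    crc = 0
    for value in data:
        crc ^= value
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def build_control_message(write: bool, register: int, value: int) -> bytes:
    """Assemble: special symbol, access byte, two data bytes, reserved, CRC."""
    if not 0 <= register <= 0x7F:
        raise ValueError("register must fit in seven bits")
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value must fit in sixteen bits")
    access = (0x80 if write else 0x00) | register  # hypothetical layout
    body = bytes([access, (value >> 8) & 0xFF, value & 0xFF, 0x00])
    # The CRC covers the body only, excluding the special symbol /K28.1/.
    return bytes([K28_1]) + body + bytes([crc8(body)])


def check_control_message(message: bytes) -> bool:
    """Return True if the message has the expected length, symbol and CRC."""
    return (
        len(message) == 6
        and message[0] == K28_1
        and crc8(message[1:5]) == message[5]
    )
```

A message built this way occupies exactly the six byte positions of the idle sequence it replaces, so the overall length of the code group sequence following the /T/R/ delimiters is unchanged.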
[0060] According to an embodiment, the PCS section 24 may comprise
circuitry 30 to detect 8 B/10 B code groups carrying in-band
control messages from among 8 B/10 B code groups received from
differential pair signal 14, and decode the control messages from
the detected 8 B/10 B code groups according to a predetermined
mapping of 8 B/10 B code groups to control messages. In response to
external signals (not shown), the circuitry 30 may encode control
messages for transmission to the data transceiver 12 as 8 B/10 B
code groups (e.g., inserted among 8 B/10 B code groups on
differential pair signal 16 containing Ethernet data frames)
according to the predetermined mapping of 8 B/10 B code groups to
control messages.
[0061] The data transceiver 12 may also comprise circuitry (not
shown) to detect 8 B/10 B code groups carrying in-band control
messages from among 8 B/10 B code groups received from differential
pair signal 16, and decode the control messages from the detected 8
B/10 B code groups according to the predetermined mapping of 8 B/10
B code groups to control messages. Similarly, the data transceiver
12 may also comprise circuitry to encode control messages for
transmission to the controller 18 as 8 B/10 B code groups
(e.g., inserted among 8 B/10 B code groups on differential pair
signal 14 containing Ethernet data frames) according to the
predetermined mapping of 8 B/10 B code groups to control
messages.
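By way of illustration only, the detection of in-band control messages at either the controller or the data transceiver may be sketched as a scan over decoded code groups. The sketch assumes that each received code group has already been decoded into a pair of a special flag and an eight-bit value, and that a control message is the special symbol K28.1 followed by five data code groups. The type alias and function name are hypothetical.

```python
# Sketch: separate six-byte control messages from a decoded stream.
# ASSUMPTION: each code group is modeled as (is_special, eight-bit value).
from typing import Iterable, List, Tuple

CodeGroup = Tuple[bool, int]
K28_1: CodeGroup = (True, 0x3C)  # special symbol that opens a control message


def extract_control_messages(
    stream: Iterable[CodeGroup],
) -> Tuple[List[bytes], List[CodeGroup]]:
    """Return (control messages, remaining code groups in original order)."""
    groups = list(stream)
    messages: List[bytes] = []
    remaining: List[CodeGroup] = []
    i = 0
    while i < len(groups):
        window = groups[i:i + 6]
        is_message = (
            len(window) == 6
            and window[0] == K28_1
            and all(not special for special, _ in window[1:])
        )
        if is_message:
            messages.append(bytes(value for _, value in window))
            i += 6
        else:
            remaining.append(groups[i])
            i += 1
    return messages, remaining
```

Code groups that do not belong to a control message pass through unchanged, so the remaining stream may be handed to the usual Ethernet data frame processing.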
[0062] According to an embodiment, the data transceiver 12 and
controller 18 may support multiple Ethernet protocols at different
bit rates including 10BASE-X (at 10 Mbps), 100BASE-X (at 100
Mbps) and 1000BASE-X (at 1000 Mbps). In addition, the data
transceiver 12 and controller 18 may support an autonegotiation
feature to select a data transmission protocol for use between the
data transceiver 12 and the node 34 for transmitting Ethernet data
frames in the transmission medium 32 as provided in IEEE Std.
802.3-2000, Clause 28. Accordingly, the data transceiver 12 may be
capable of negotiating with the node 34 to select the data
transmission protocol having the highest data rate from among
common data transmission protocols (e.g., 10BASE-X, 100BASE-X,
1000BASE-X or 10GBASE-X). Following negotiation between the data
transceiver 12 and the node 34 to the common data transmission
protocol having the highest data rate, the controller 18 may
communicate with the node 34 to identify and negotiate additional
capabilities (e.g., abilities to transmit in full or half duplex
modes) while communicating according to the selected data
transmission protocol as provided in IEEE Std. 802.3-2000, Clause
37.
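By way of illustration only, the outcome of the negotiation described above, namely selection of the common data transmission protocol having the highest data rate, may be sketched as follows. The sketch models only the selection result and not the signaling defined in IEEE Std. 802.3; the table and function name are hypothetical.

```python
# Sketch: pick the highest-rate protocol supported by both link partners.
# This models only the selection outcome, not the negotiation signaling.
from typing import Iterable, Optional

RATES_MBPS = {
    "10BASE-X": 10,
    "100BASE-X": 100,
    "1000BASE-X": 1000,
    "10GBASE-X": 10000,
}


def select_protocol(local: Iterable[str], remote: Iterable[str]) -> Optional[str]:
    """Return the common protocol with the highest data rate, or None."""
    common = set(local) & set(remote)
    if not common:
        return None
    return max(common, key=lambda name: RATES_MBPS[name])
```

For example, if the data transceiver supports 10BASE-X, 100BASE-X and 1000BASE-X while the node supports only 10BASE-X and 100BASE-X, the sketch selects 100BASE-X.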
[0063] Among control messages that may be transmitted from the data
transceiver 12 to the controller 18 in 8 B/10 B code groups over
the differential pair signal 14, the data transceiver 12 may
transmit one or more control messages to the controller 18
indicating a data transmission protocol or data rate selected
through autonegotiation, or status of the data link between the
data transceiver 12 and the node 34 (e.g., active versus inactive,
connected versus unconnected, changes in data transmission mode
from 10 Gbps to 1 Gbps, etc.). In response to receipt of either of
these control messages, the controller 18 may respond by
transmitting an acknowledgement in one or more 8 B/10 B code groups
over the differential pair signal 16. However, these are merely
examples of control messages that may be transmitted from a data
transceiver to a controller in 8 B/10 B code groups over a
differential pair signal and embodiments of the present invention
are not limited in these respects.
[0064] Using control messages transmitted as 8 B/10 B code groups
in the differential pair signal 14 and in response to the data rate
selected from autonegotiation with the node 34, the data
transceiver 12 and controller 18 may configure the data rate of the
differential pair signals 14 and 16 according to the selected data
rate. For example, if the data rate selected through
autonegotiation is 1000 Mbps (e.g., from a selected 1000BASE-X
protocol), the data transceiver 12 and controller 18 may configure
the differential pair signals 14 and 16 to transmit at a data rate
of 1.25 Gbps (allowing 250 Mbps of overhead for transmitting 8
B/10 B code groups encoded from eight-bit bytes of Ethernet data
frames). For a selected data rate of 10 or 100 Mbps, the data
transceiver 12 and controller 18 may transmit duplicate Ethernet
data frames or code groups in differential pair signals 14 and 16
transmitting at 1.25 Gbps. Alternatively, if the data rate selected
through autonegotiation is 10 or 100 Mbps (e.g., from a selected
10BASE-X or 100BASE-X protocol), the data transceiver 12 and
controller 18 may configure the differential pair signals 14 and 16
at a data rate of 125 Mbps. Transmitting differential pair signals
14 and 16 at the lower data rate of 125 Mbps may enable the data
transceiver 12 and controller 18 to operate at lower power (over
transmitting at the higher 1.25 Gbps data rate).
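By way of illustration only, the rate arithmetic above may be worked through as follows. Because each eight-bit byte is carried as a ten-bit code group, a data rate of 1000 Mbps requires a signal rate of 1000 x 10/8 = 1250 Mbps, which accounts for the 250 Mbps of overhead. The description states that duplicate frames or code groups may be transmitted at the lower data rates but does not state how many times each is repeated; the replication factor computed below is an inference from the rate ratio and is labeled as an assumption. The function names are hypothetical.

```python
# Sketch: 8B/10B line-rate arithmetic for the differential pair signals.
from fractions import Fraction


def line_rate_mbps(data_rate_mbps: int) -> Fraction:
    """Signal rate needed to carry data_rate_mbps with 8B/10B encoding."""
    return Fraction(data_rate_mbps) * 10 / 8


def replication_factor(link_rate_mbps: int, data_rate_mbps: int) -> int:
    """ASSUMPTION: how many times each code group would be repeated to
    fill a link running faster than the encoded data rate requires."""
    ratio = Fraction(link_rate_mbps) / line_rate_mbps(data_rate_mbps)
    if ratio < 1 or ratio.denominator != 1:
        raise ValueError("link rate must be a whole multiple of the line rate")
    return int(ratio)
```

Under this assumption, a 100 Mbps data rate on a 1.25 Gbps link implies a factor of ten, while the same data rate on a 125 Mbps link implies a factor of one, which is consistent with the lower-power alternative described above.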
[0065] According to an embodiment, the controller 18 may be
included as part of a computing platform and coupled to a host
processing system (e.g., including a host processor, I/O core logic
and system memory) hosting an operating system and/or application
programs. As such, the computing platform may define certain states
and events such as, for example, a software reset event, power
states (e.g., full power, standby, snooze, etc.) and events
indicating a transition between power states. Among control
messages that may be transmitted from the controller 18 to the data
transceiver 12 in 8 B/10 B code groups over the differential pair
signal 16, the controller 18 may transmit control messages
indicating a change in the power state of the computing platform
(e.g., change from full power to standby or snooze, or from standby
or snooze to resume operation at full power) enabling the data
transceiver to operate at low voltage when the computing platform
is not operating at a full power state. However, these are merely
examples of control messages that may be transmitted from a
controller to a data transceiver in 8 B/10 B code groups over a
differential pair signal and embodiments of the present invention
are not limited in these respects.
[0066] According to an embodiment, the controller 18 may perform
code group and bit synchronization in response to the differential
pair signal 14 to ensure the alignment of 8 B/10 B code groups from
the data transceiver 12. Similarly, the data transceiver 12 may
also perform code group and bit synchronization in response to the
differential pair signal 16 to ensure alignment of 8 B/10 B code
groups from the controller 18. The controller 18 and data
transceiver 12 may perform this code group and bit synchronization
as provided in IEEE Std. 802.3-2000, Clauses 36.2.4 and 36.2.5.2.6
to ensure synchronization of multi-code group ordered sets to code
group boundaries. However, these are merely examples of how code
group and bit synchronization may be established and embodiments of
the present invention are not limited in these respects.
[0067] While transmitting control messages between the controller
18 and data transceiver 12 as in-band 8 B/10 B code groups and
achieving code group and bit synchronization from detection of the
received code groups, the controller 18 and data transceiver 12
need only communicate with each other through four device pins
(i.e., four device pins on each device to enable transmission of
the differential pair signals 14 and 16 between the data
transceiver 12 and controller 18). For example, the use of separate
pins for an MDIO interface may be avoided by transmitting control
messages in-band over the differential pair signals 14 and 16.
[0068] The differential pair signals 14 and 16 may be transmitted
in a DDI extending thirty inches or more over a circuit board
coupling the data transceiver 12 and controller 18 to the DDI.
According to an embodiment, the system 10 may be provided on a line
card in a switch, router or other platform that may be used for
forwarding the contents of an Ethernet data frame between the node
34 and another node. The system 10 may provide a single port among
multiple ports coupled by switching circuitry (e.g., switch fabric
or Ethernet switch, not shown) to forward data frames from a source
port (or ingress port) to a destination port (or egress port). For
example, the MAC receive block 26 and MAC transmit block 28 may be
coupled to the switching circuitry to forward the contents of
frames to, and receive frames from, other ports. Also, the MAC
receive block 26 or MAC transmit block 28 may be coupled to network
processing devices (e.g., network processor, packet processing ASIC
or other device for performing packet classification, protocol
processing, intrusion detection, etc.). However, these are merely
examples of applications of a line card and embodiments of the
present invention are not limited in these respects.
[0069] In an alternative embodiment, the system 10 may be provided
in a system board or motherboard including a host processor (e.g.,
microprocessor for hosting an operating system and applications)
and an I/O core logic chipset (e.g., system memory controller and
peripheral I/O controller, not shown). In this embodiment, the
controller 18 may be integrated with one or more portions of an I/O
core logic chipset while the data transceiver 12 is located near a
physical port connection (e.g., cable connection) separated from
the I/O core logic chipset. The controller 18 may be coupled to a
multiplex data bus as defined in versions of the Peripheral
Components Interconnect (PCI) Local Bus Specification 2.3, PCI-X or
PCI-Express (e.g., coupled to a "switch" entity). The system board
or motherboard of the presently illustrated embodiment may be
combined with a system memory for storing machine-readable
instructions of an operating system or application programs to be
executed by the host processor. For example, the host processor and
system memory may host a device driver that defines buffer
locations in the system memory that are used to store data packets
received from the controller 18 in data frames or store data
packets to be transmitted by the controller as Ethernet data
frames. Additionally, the controller 18 may comprise a TCP/IP
offload engine (not shown) for performing TCP/IP protocol
processing on TCP/IP packets received in Ethernet data frames from
the node 34.
[0070] Particular embodiments described herein relate to the
transmission of 8 B/10 B code groups (e.g., including Ethernet data
frames and control messages) between the data transceiver 12 and
controller 18 in single differential pair signals 14 and 16. In
other embodiments, however, the 8 B/10 B code groups may be
transmitted between such a data transceiver and controller in
multiple differential pair signals. For example, such a data
transceiver and controller may be coupled by a DDI comprising a 10
Gigabit Attachment Unit Interface (XAUI) providing four
differential pair signals to transmit 8 B/10 B code groups from the
data transceiver to the controller and four differential pair
signals to transmit 8 B/10 B code groups from the controller to the
data transceiver. In this embodiment, the data transceiver and
controller may each comprise sixteen device pins for coupling to
the DDI, eight pins for transmitting 8 B/10 B code groups and eight
pins for receiving 8 B/10 B code groups. Accordingly, 8 B/10 B code
groups containing control messages may be inserted among 8 B/10 B
code groups (containing Ethernet data frames) transmitted between
the controller and data transceiver (in multiple differential pair
signals) to obviate the need for an out-of-band channel for
transmitting the control messages between the data transceiver and
the controller.
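By way of illustration only, the device pin counts given in this description follow from two pins per differential pair, with the same number of pairs in each direction. The function name below is hypothetical.

```python
# Sketch: device pins needed on each component for a DDI with a given
# number of differential pairs in each direction (two pins per pair).

PINS_PER_DIFFERENTIAL_PAIR = 2
DIRECTIONS = 2  # transmit and receive


def device_pins(pairs_per_direction: int) -> int:
    """Return the pin count per component for the given pair count."""
    if pairs_per_direction < 1:
        raise ValueError("at least one pair per direction is required")
    return DIRECTIONS * pairs_per_direction * PINS_PER_DIFFERENTIAL_PAIR
```

One pair in each direction yields the four device pins of the single differential pair embodiment, and four pairs in each direction yields the sixteen device pins of the XAUI embodiment.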
[0071] In the description above, for the purposes of explanation,
numerous specific details are set forth in order to provide a
thorough understanding of the present invention. It will be
apparent, however, to one skilled in the art that the present
invention may be practiced without some of these specific details.
In other instances, well-known structures and devices are shown in
block diagram form.
[0072] The present invention may include various processes. The
processes of the present invention may be performed by hardware
components or may be embodied in machine-executable instructions,
which may be used to cause a general-purpose or special-purpose
processor or logic circuits programmed with the instructions to
perform the processes. Alternatively, the processes may be
performed by a combination of hardware and software.
[0073] Portions of the present invention may be provided as a
computer program product, which may include a machine-readable
medium having stored thereon instructions, which may be used to
program a computer (or other electronic devices) to perform a
process according to the present invention. The machine-readable
medium may include, but is not limited to, floppy diskettes,
optical disks, CD-ROMs (compact disk read-only memory), and
magneto-optical disks, ROMs (read-only memory), RAMs (random access
memory), EPROMs (erasable programmable read-only memory), EEPROMs
(electrically-erasable programmable read-only memory), magnetic or
optical cards, flash memory, or other type of
media/machine-readable medium suitable for storing electronic
instructions. Moreover, the present invention may also be
downloaded as a computer program product, wherein the program may
be transferred from a remote computer to a requesting computer by
way of data signals embodied in a carrier wave or other propagation
medium via a communication link (e.g., a modem or network
connection).
[0074] Many of the methods are described in their most basic form,
but processes can be added to or deleted from any of the methods
and information can be added or subtracted from any of the
described messages without departing from the basic scope of the
present invention. It will be apparent to those skilled in the art
that many further modifications and adaptations can be made. The
particular embodiments are not provided to limit the invention but
to illustrate it. The scope of the present invention is not to be
determined by the specific examples provided above but only by the
claims below.
[0075] It should also be appreciated that reference throughout this
specification to "one embodiment" or "an embodiment" means that a
particular feature may be included in the practice of the
invention. Similarly, it should be appreciated that in the
foregoing description of exemplary embodiments of the invention,
various features of the invention are sometimes grouped together in
a single embodiment, figure, or description thereof for the purpose
of streamlining the disclosure and aiding in the understanding of
one or more of the various inventive aspects. This method of
disclosure, however, is not to be interpreted as reflecting an
intention that the claimed invention requires more features than
are expressly recited in each claim. Rather, as the following
claims reflect, inventive aspects lie in less than all features of
a single foregoing disclosed embodiment. Thus, the claims are
hereby expressly incorporated into this description, with each
claim standing on its own as a separate embodiment of this
invention.
* * * * *