U.S. patent application number 10/144658, for a method and apparatus for providing an integrated network of processors, was filed with the patent office on May 13, 2002, and published on November 13, 2003.
This patent application is assigned to Nvidia Corporation. The invention is credited to Robert A. Alfieri and Gary D. Hicok.
Application Number | 10/144658 |
Publication Number | 20030212735 |
Document ID | / |
Family ID | 29400386 |
Publication Date | 2003-11-13 |
United States Patent Application 20030212735
Kind Code | A1 |
Hicok, Gary D.; et al. | November 13, 2003 |
Method and apparatus for providing an integrated network of processors
Abstract
A novel network architecture that integrates the functions of an
internet protocol (IP) router into a network processing unit (NPU)
that resides in a host computer's chipset such that the host
computer's resources are perceived as separate network appliances.
The NPU appears logically separate from the host computer even
though, in one embodiment, it is sharing the same chip.
Inventors: | Hicok, Gary D. (Mesa, AZ); Alfieri, Robert A. (Chapel Hill, NC) |
Correspondence Address: | MOSER, PATTERSON & SHERIDAN L.L.P., 595 Shrewsbury Ave., First Floor, Shrewsbury, NJ 07702, US |
Assignee: | Nvidia Corporation |
Family ID: | 29400386 |
Appl. No.: | 10/144658 |
Filed: | May 13, 2002 |
Current U.S. Class: | 709/201; 709/250 |
Current CPC Class: | H04L 69/329 (2013.01); H04L 67/1097 (2013.01); H04L 69/12 (2013.01) |
Class at Publication: | 709/201; 709/250 |
International Class: | G06F 015/16 |
Claims
What is claimed is:
1. Method for providing a distributed network of processing units,
said method comprising: a) providing a network processing unit; and
b) providing at least one host, wherein said at least one host
comprises a central processing unit (CPU) and a plurality of
auxiliary processing units, wherein said central processing unit is
loaded with a host operating system and wherein said plurality of
auxiliary processing units bypass said host operating system and
communicate directly with each other via said network processing
unit.
2. The method of claim 1, wherein said plurality of auxiliary
processing units employ a network protocol to communicate with said
network processing unit.
3. The method of claim 2, wherein each of said plurality of
auxiliary processing units communicates with said network
processing unit via a media access controller (MAC).
4. The method of claim 2, wherein said network protocol is
Transmission Control Protocol/Internet Protocol (TCP/IP).
5. The method of claim 2, wherein said network protocol is User
Datagram Protocol (UDP).
6. The method of claim 1, wherein each of said plurality of
auxiliary processing units is perceived as a separate network
appliance.
7. The method of claim 1, wherein one of said plurality of
auxiliary processing units is an auxiliary storage processing
unit.
8. The method of claim 1, wherein one of said plurality of
auxiliary processing units is an auxiliary audio processing
unit.
9. The method of claim 1, wherein one of said plurality of
auxiliary processing units is an auxiliary graphics processing
unit.
10. The method of claim 1, wherein one of said plurality of
auxiliary processing units is an auxiliary video processing
unit.
11. The method of claim 1, wherein one of said plurality of
auxiliary processing units is an auxiliary physics processing
unit.
12. The method of claim 1, wherein said at least one host is a
virtual host.
13. The method of claim 1, wherein said at least one host comprises
a plurality of virtual hosts, where at least two of said plurality
of virtual hosts are loaded with a separate operating system.
14. The method of claim 13, wherein each of said plurality of
virtual hosts is capable of accessing said plurality of auxiliary
processing units via said network processing unit.
15. The method of claim 1, wherein said plurality of auxiliary
processing units comprise an auxiliary storage processing unit, an
auxiliary audio processing unit, an auxiliary graphics processing
unit, and an auxiliary video processing unit.
16. A distributed network of processing units, said network
comprising: a network processing unit; and at least one host,
wherein said at least one host comprises a central processing unit
(CPU) and a plurality of auxiliary processing units, wherein said
central processing unit is loaded with a host operating system and
wherein said plurality of auxiliary processing units bypass said
host operating system and communicate directly with each other via
said network processing unit.
17. The network of claim 16, wherein said plurality of auxiliary
processing units employ a network protocol to communicate with said
network processing unit.
18. The network of claim 17, wherein each of said plurality of
auxiliary processing units communicates with said network
processing unit via a media access controller (MAC).
19. The network of claim 17, wherein said network protocol is
Transmission Control Protocol/Internet Protocol (TCP/IP).
20. The network of claim 17, wherein said network protocol is User
Datagram Protocol (UDP).
21. The network of claim 16, wherein each of said plurality of
auxiliary processing units is perceived as a separate network
appliance.
22. The network of claim 16, wherein one of said plurality of
auxiliary processing units is an auxiliary storage processing
unit.
23. The network of claim 16, wherein one of said plurality of
auxiliary processing units is an auxiliary audio processing
unit.
24. The network of claim 16, wherein one of said plurality of
auxiliary processing units is an auxiliary graphics processing
unit.
25. The network of claim 16, wherein one of said plurality of
auxiliary processing units is an auxiliary video processing
unit.
26. The network of claim 16, wherein one of said plurality of
auxiliary processing units is an auxiliary physics processing
unit.
27. The network of claim 16, wherein said at least one host is a
virtual host.
28. The network of claim 16, wherein said at least one host
comprises a plurality of virtual hosts, where at least two of said
plurality of virtual hosts are loaded with a separate operating
system.
29. The network of claim 28, wherein each of said plurality of
virtual hosts is capable of accessing said plurality of auxiliary
processing units via said network processing unit.
30. The network of claim 16, wherein said plurality of auxiliary
processing units comprise an auxiliary storage processing unit, an
auxiliary audio processing unit, an auxiliary graphics processing
unit, and an auxiliary video processing unit.
31. The network of claim 16, wherein said network processing unit
is implemented on a chipset.
32. The network of claim 31, wherein at least one of said plurality
of auxiliary processing units is implemented on a chipset.
33. Method for providing a distributed network of processing units,
said method comprising: a) providing a network processing unit; b)
providing at least one host, wherein said at least one host
comprises a central processing unit (CPU) loaded with a host
operating system; and c) providing a plurality of auxiliary
processing units, wherein said plurality of auxiliary processing
units bypass said host operating system and communicate directly
with each other via said network processing unit.
34. The method of claim 33, wherein said plurality of auxiliary
processing units employ a network protocol to communicate with said
network processing unit.
35. The method of claim 34, wherein each of said plurality of
auxiliary processing units communicates with said network
processing unit via a media access controller (MAC).
36. The method of claim 34, wherein said network protocol is
Transmission Control Protocol/Internet Protocol (TCP/IP).
37. The method of claim 34, wherein said network protocol is User
Datagram Protocol (UDP).
38. The method of claim 33, wherein each of said plurality of
auxiliary processing units is perceived as a separate network
appliance.
39. The method of claim 33, wherein one of said plurality of
auxiliary processing units is an auxiliary storage processing
unit.
40. The method of claim 33, wherein one of said plurality of
auxiliary processing units is an auxiliary audio processing
unit.
41. The method of claim 33, wherein one of said plurality of
auxiliary processing units is an auxiliary graphics processing
unit.
42. The method of claim 33, wherein one of said plurality of
auxiliary processing units is an auxiliary video processing
unit.
43. The method of claim 33, wherein one of said plurality of
auxiliary processing units is an auxiliary physics processing
unit.
44. The method of claim 33, wherein said at least one host is a
virtual host.
45. The method of claim 33, wherein said at least one host
comprises a plurality of virtual hosts, where at least two of said
plurality of virtual hosts are loaded with a separate operating
system.
46. The method of claim 45, wherein each of said plurality of
virtual hosts is capable of accessing said plurality of auxiliary
processing units via said network processing unit.
47. A distributed network of processing units, said network
comprising: a network processing unit; at least one host, wherein
said at least one host comprises a central processing unit (CPU)
loaded with a host operating system; and a plurality of auxiliary
processing units, wherein said plurality of auxiliary processing
units bypass said host operating system and communicate directly
with each other via said network processing unit.
48. The network of claim 47, wherein said plurality of auxiliary
processing units employ a network protocol to communicate with said
network processing unit.
49. The network of claim 48, wherein each of said plurality of
auxiliary processing units communicates with said network
processing unit via a media access controller (MAC).
50. The network of claim 48, wherein said network protocol is
Transmission Control Protocol/Internet Protocol (TCP/IP).
51. The network of claim 48, wherein said network protocol is User
Datagram Protocol (UDP).
52. The network of claim 47, wherein each of said plurality of
auxiliary processing units is perceived as a separate network
appliance.
53. The network of claim 47, wherein one of said plurality of
auxiliary processing units is an auxiliary storage processing
unit.
54. The network of claim 47, wherein one of said plurality of
auxiliary processing units is an auxiliary audio processing
unit.
55. The network of claim 47, wherein one of said plurality of
auxiliary processing units is an auxiliary graphics processing
unit.
56. The network of claim 47, wherein one of said plurality of
auxiliary processing units is an auxiliary video processing
unit.
57. The network of claim 47, wherein one of said plurality of
auxiliary processing units is an auxiliary physics processing
unit.
58. The network of claim 47, wherein said at least one host is a
virtual host.
59. The network of claim 47, wherein said at least one host
comprises a plurality of virtual hosts, where at least two of said
plurality of virtual hosts are loaded with a separate operating
system.
60. The network of claim 59, wherein each of said plurality of
virtual hosts is capable of accessing said plurality of auxiliary
processing units via said network processing unit.
61. The network of claim 47, wherein said network processing unit
is implemented on a chipset.
62. The network of claim 61, wherein at least one of said plurality
of auxiliary processing units is implemented on a chipset.
63. Method for providing a distributed network of processing units
and host resources, said method comprising: a) providing a network
processing unit; and b) providing at least one host, wherein said
at least one host comprises a central processing unit (CPU) loaded
with a host operating system and a plurality of host resources,
wherein each of said plurality of host resources is accessible
directly by said central processing unit and via said network
processing unit.
64. The method of claim 63, wherein one of said plurality of host
resources is a storage device.
65. The method of claim 63, wherein one of said plurality of host
resources is a read only memory (ROM).
66. The method of claim 63, wherein one of said plurality of host
resources is a random access memory (RAM).
67. A distributed network of processing units and host resources,
said network comprising: a network processing unit; and at least
one host, wherein said at least one host comprises a central
processing unit (CPU) loaded with a host operating system and a
plurality of host resources, wherein each of said plurality of host
resources is accessible directly by said central processing unit
and via said network processing unit.
68. The network of claim 67, wherein one of said plurality of host
resources is a storage device.
69. The network of claim 67, wherein one of said plurality of host
resources is a read only memory (ROM).
70. The network of claim 67, wherein one of said plurality of host
resources is a random access memory (RAM).
71. Method for providing a distributed network of processing units
and host resources, said method comprising: a) providing a first
network processing unit; b) providing a first host comprising a
first central processing unit (CPU) loaded with a first host
operating system and a plurality of first host resources; c)
providing a second network processing unit; and d) providing a
second host comprising a second central processing unit (CPU)
loaded with a second host operating system and a plurality of
second host resources, wherein each of said plurality of first host
resources is accessible via said first and second network
processing units by bypassing said first host operating system.
72. The method of claim 71, wherein each of said plurality of
second host resources is accessible via said first and second
network processing units by bypassing said second host operating
system.
73. The method of claim 71, further comprising: e) forwarding a
media stream in real time from one of said plurality of first host
resources to said second host operating system.
74. The method of claim 71, wherein said plurality of first host
resources comprise at least one auxiliary processing unit.
75. The method of claim 74, wherein said at least one auxiliary
processing unit is an auxiliary storage processing unit.
76. The method of claim 74, wherein said at least one auxiliary
processing unit is an auxiliary audio processing unit.
77. The method of claim 74, wherein said at least one auxiliary
processing unit is an auxiliary graphics processing unit.
78. The method of claim 74, wherein said at least one auxiliary
processing unit is an auxiliary video processing unit.
79. The method of claim 74, wherein said at least one auxiliary
processing unit is an auxiliary physics processing unit.
80. A distributed network of processing units and host resources,
said network comprising: a first network processing unit; a first
host comprising a first central processing unit (CPU) loaded with a
first host operating system and a plurality of first host
resources; a second network processing unit; and a second host
comprising a second central processing unit (CPU) loaded with a
second host operating system and a plurality of second host
resources, wherein each of said plurality of first host resources
is accessible via said first and second network processing units by
bypassing said first host operating system.
81. The network of claim 80, wherein each of said plurality of
second host resources is accessible via said first and second
network processing units by bypassing said second host operating
system.
82. The network of claim 80, wherein one of said plurality of first
host resources forwards a media stream in real time to said second
host operating system.
83. The network of claim 80, wherein said plurality of first host
resources comprise at least one auxiliary processing unit.
84. The network of claim 83, wherein said at least one auxiliary
processing unit is an auxiliary storage processing unit.
85. The network of claim 83, wherein said at least one auxiliary
processing unit is an auxiliary audio processing unit.
86. The network of claim 83, wherein said at least one auxiliary
processing unit is an auxiliary graphics processing unit.
87. The network of claim 83, wherein said at least one auxiliary
processing unit is an auxiliary video processing unit.
88. The network of claim 83, wherein said at least one auxiliary
processing unit is an auxiliary physics processing unit.
89. Method for providing a distributed network of processing units,
said method comprising: a) providing a network processing unit; and
b) providing at least one host, wherein said at least one host
comprises a central processing unit (CPU) and at least one
auxiliary processing unit, wherein said central processing unit is
loaded with a host operating system and wherein said at least one
auxiliary processing unit bypasses said host operating system and
communicates directly with said network processing unit.
90. The method of claim 89, wherein said at least one auxiliary
processing unit employs a network protocol to communicate with said
network processing unit.
91. The method of claim 90, wherein said at least one auxiliary
processing unit communicates with said network processing unit via
a media access controller (MAC).
92. A distributed network of processing units, said network
comprising: a network processing unit; and at least one host,
wherein said at least one host comprises a central processing unit
(CPU) and at least one auxiliary processing unit, wherein said
central processing unit is loaded with a host operating system and
wherein said at least one auxiliary processing unit bypasses said
host operating system and communicates directly with said network
processing unit.
93. The network of claim 92, wherein said at least one auxiliary
processing unit employs a network protocol to communicate with said
network processing unit.
94. The network of claim 93, wherein said at least one auxiliary
processing unit communicates with said network processing unit via
a media access controller (MAC).
95. Method for providing a distributed network of processing units
for interacting with at least one host that comprises a central
processing unit (CPU), wherein said central processing unit is
loaded with a host operating system, said method comprising: a)
providing a network processing unit; and b) providing at least one
auxiliary processing unit, wherein said network processing unit and
said at least one auxiliary processing unit bypass said host
operating system and communicate directly with each other.
96. The method of claim 95, wherein said at least one auxiliary
processing unit comprises two auxiliary processing units that
bypass said host operating system and communicate directly with
each other through said network processing unit.
97. The method of claim 95, wherein said at least one auxiliary
processing unit employs a network protocol to communicate with said
network processing unit.
98. The method of claim 97, wherein said at least one auxiliary
processing unit communicates with said network processing unit via
a media access controller (MAC).
99. The method of claim 97, wherein said network protocol is
Transmission Control Protocol/Internet Protocol (TCP/IP).
100. The method of claim 97, wherein said network protocol is User
Datagram Protocol (UDP).
101. The method of claim 95, wherein said at least one auxiliary
processing unit is perceived as a separate network appliance.
102. The method of claim 95, wherein said at least one auxiliary
processing unit is an auxiliary storage processing unit.
103. The method of claim 95, wherein said at least one auxiliary
processing unit is an auxiliary audio processing unit.
104. The method of claim 95, wherein said at least one auxiliary
processing unit is an auxiliary graphics processing unit.
105. The method of claim 95, wherein said at least one auxiliary
processing unit is an auxiliary video processing unit.
106. The method of claim 95, wherein said at least one auxiliary
processing unit is an auxiliary physics processing unit.
107. The method of claim 95, wherein said at least one host is a
virtual host.
108. The method of claim 95, wherein said at least one host
comprises a plurality of virtual hosts, where at least two of said
plurality of virtual hosts are loaded with a separate operating
system.
109. The method of claim 108, wherein each of said plurality of
virtual hosts is capable of accessing said at least one auxiliary
processing unit via said network processing unit.
110. The method of claim 95, wherein said at least one auxiliary
processing unit comprises an auxiliary storage processing unit, an
auxiliary audio processing unit, an auxiliary graphics processing
unit, and an auxiliary video processing unit.
111. A distributed network of processing units for interacting with
at least one host that comprises a central processing unit (CPU),
wherein said central processing unit is loaded with a host
operating system, said network comprising: a network processing
unit; and at least one auxiliary processing unit, wherein said
network processing unit and said at least one auxiliary processing
unit bypass said host operating system and communicate directly
with each other.
112. The network of claim 111, wherein said at least one auxiliary
processing unit comprises two auxiliary processing units that
bypass said host operating system and communicate directly with
each other through said network processing unit.
113. The network of claim 111, wherein said at least one auxiliary
processing unit employs a network protocol to communicate with said
network processing unit.
114. The network of claim 113, wherein said at least one auxiliary
processing unit communicates with said network processing unit via
a media access controller (MAC).
115. The network of claim 113, wherein said network protocol is
Transmission Control Protocol/Internet Protocol (TCP/IP).
116. The network of claim 113, wherein said network protocol is
User Datagram Protocol (UDP).
117. The network of claim 111, wherein said at least one auxiliary
processing unit is perceived as a separate network appliance.
118. The network of claim 111, wherein said at least one auxiliary
processing unit is an auxiliary storage processing unit.
119. The network of claim 111, wherein said at least one auxiliary
processing unit is an auxiliary audio processing unit.
120. The network of claim 111, wherein said at least one auxiliary
processing unit is an auxiliary graphics processing unit.
121. The network of claim 111, wherein said at least one auxiliary
processing unit is an auxiliary video processing unit.
122. The network of claim 111, wherein said at least one auxiliary
processing unit is an auxiliary physics processing unit.
123. The network of claim 111, wherein said at least one host is a
virtual host.
124. The network of claim 111, wherein said at least one host
comprises a plurality of virtual hosts, where at least two of said
plurality of virtual hosts are loaded with a separate operating
system.
125. The network of claim 124, wherein each of said plurality of
virtual hosts is capable of accessing said at least one auxiliary
processing unit via said network processing unit.
126. The network of claim 111, wherein said at least one auxiliary
processing unit comprises an auxiliary storage processing unit, an
auxiliary audio processing unit, an auxiliary graphics processing
unit, and an auxiliary video processing unit.
127. The network of claim 111, wherein said network processing unit
is implemented on a chipset.
128. The network of claim 127, wherein said at least one auxiliary
processing unit is implemented on a chipset.
129. Method for providing a distributed network of processing
units, said method comprising: a) providing a network processing
unit; and b) providing at least one host, wherein said at least one
host comprises a central processing unit (CPU) and a plurality of
auxiliary processing units, wherein said central processing unit is
loaded with a host operating system and wherein said plurality of
auxiliary processing units bypass said host operating system and
communicate directly with each other via said network processing
unit, wherein said plurality of auxiliary processing units employ
Transmission Control Protocol/Internet Protocol (TCP/IP) to
communicate with said network processing unit.
130. Method for providing a distributed network of processing
units, said method comprising: a) providing a network processing
unit; and b) providing at least one host, wherein said at least one
host comprises a central processing unit (CPU) and a plurality of
auxiliary processing units, wherein said central processing unit is
loaded with a host operating system and wherein said plurality of
auxiliary processing units bypass said host operating system and
communicate directly with each other via said network processing
unit, wherein said plurality of auxiliary processing units employ
User Datagram Protocol (UDP) to communicate with said network
processing unit.
131. A distributed network of processing units, said network
comprising: a network processing unit; and at least one host,
wherein said at least one host comprises a central processing unit
(CPU) and a plurality of auxiliary processing units, wherein said
central processing unit is loaded with a host operating system and
wherein said plurality of auxiliary processing units bypass said
host operating system and communicate directly with each other via
said network processing unit, wherein said plurality of auxiliary
processing units employ Transmission Control Protocol/Internet
Protocol (TCP/IP) to communicate with said network processing
unit.
132. A distributed network of processing units, said network
comprising: a network processing unit; and at least one host,
wherein said at least one host comprises a central processing unit
(CPU) and a plurality of auxiliary processing units, wherein said
central processing unit is loaded with a host operating system and
wherein said plurality of auxiliary processing units bypass said
host operating system and communicate directly with each other via
said network processing unit, wherein said plurality of auxiliary
processing units employ User Datagram Protocol (UDP) to communicate
with said network processing unit.
Description
[0001] The present invention relates to a novel network
architecture. More specifically, the present invention integrates
the functions of an internet protocol (IP) router into a network
processing unit that resides in a host computer's chipset such that
the host computer's resources are perceived as separate network
appliances.
BACKGROUND OF THE DISCLOSURE
[0002] FIG. 1 illustrates traditional internal content sources and
data pipes where the data routing function is performed by a host
central processing unit (CPU) and its operating system (OS) 110.
Namely, the host computer may comprise a number of storage devices
120, a plurality of media engines 130, and a plurality of other
devices that are accessible via input/output ports 140, e.g.,
universal serial bus (USB) and the like. In turn, the host computer
may access a network 150 via application programming interfaces
(APIs) and a media access controller (MAC).
[0003] However, a significant drawback of this data routing
architecture is that the host computer's resources or devices are
only accessible with the involvement of the host CPU/OS. Typically,
accessing the host resources from external computers is either
prohibited or it is necessary to request access through the host
computer using high-level protocols. If the host CPU/OS is
overtaxed, substantial latency arises as data flows stall in the
OS network stacks.
[0004] Therefore, a need exists for a novel network architecture
that allows a host computer's resources to be perceived as separate
network appliances that are accessible without the involvement of
the host computer's CPU/OS.
SUMMARY OF THE INVENTION
[0005] The present invention is a novel network architecture. More
specifically, the present invention integrates the functions of an
internet protocol (IP) router into a network processing unit (NPU)
that resides in a host computer's chipset such that the host
computer's resources are perceived as separate network appliances.
The NPU appears logically separate from the host computer even
though, in one embodiment, it is sharing the same chip. A host
computer's "chipset" is one or more integrated circuits coupled to
a CPU that provide various interfaces (e.g., main memory, hard
disks, floppy, USB, PCI, etc.), exemplified by Intel's Northbridge
and Southbridge integrated circuits.
[0006] In operation, the host computer has a virtual port (i.e.,
host MAC) that is in communication with the network processing unit
and communicates with the NPU as if it is an external network
appliance using standard networking protocols. In one embodiment,
the host computer communicates via the NPU with one or more
auxiliary or dedicated processing units that are deployed to
perform dedicated tasks. These auxiliary processing units can be
part of the host or can be deployed separate from the host to meet
different application requirements. For example, some of these
auxiliary processing units include, but are not limited to, a
graphics processing unit (GPU), an audio processing unit (APU), a
video processing unit (VPU), a storage processing unit (SPU), and a
physics processing unit (PPU). The present disclosure refers to
these auxiliary processing units as XPU, where the "X" is replaced
to signify a particular function performed by the processing unit.
Finally, the network processing unit itself is an XPU because it
can, in addition to routing packets among XPUs, perform various
processing accelerations on these packets, such as authentication,
encryption, compression, TCP, IPSec/VPN/PPP encapsulation and so
on.
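The appliance-style addressing described above can be sketched briefly. The following Python fragment is only an illustration of the idea that each XPU is addressable as a separate network endpoint; the IP addresses and port numbers are assumptions for illustration, not values taken from this application.

```python
# Hypothetical endpoints: each auxiliary processing unit (XPU) is
# perceived as a separate network appliance with its own address.
# All addresses and ports below are illustrative assumptions.
XPU_ENDPOINTS = {
    "GPU": ("192.168.0.10", 7001),  # graphics processing unit
    "APU": ("192.168.0.11", 7002),  # audio processing unit
    "VPU": ("192.168.0.12", 7003),  # video processing unit
    "SPU": ("192.168.0.13", 7004),  # storage processing unit
    "PPU": ("192.168.0.14", 7005),  # physics processing unit
}

def datagram_for(xpu_name, payload):
    """Pair a payload with the destination endpoint of the named XPU,
    as a host might when addressing that XPU over a standard protocol
    such as UDP."""
    return XPU_ENDPOINTS[xpu_name], payload

# A storage request addressed to the SPU as if it were an appliance.
dest, msg = datagram_for("SPU", b"READ block")
```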
[0007] One unique aspect of the present invention is that the XPUs
have logically direct attachments to the NPU, which effectively
serves as an integrated router, thereby allowing XPUs to be seen as
separate network appliances. Since these auxiliary processing units
have first-class status in this logical network architecture, they
are allowed to communicate with each other or with any external
computer (e.g., via another NPU) directly using standard internet
protocols such as IP, TCP, UDP and the like without the involvement
of the host CPU/OS. Using this novel architecture, the NPU provides
both local (or host) access and remote access acceleration in a
distributed computing environment.
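The integrated-router role described above can be modeled as a short sketch. The packet layout (a Python dict) and the per-XPU queue model are illustrative assumptions, not the actual hardware design.

```python
from collections import deque

class NPUSketch:
    """Toy model of the NPU as an integrated router: packets move
    directly between XPU delivery queues with no host CPU/OS
    involvement. The dict-based packet format is an assumption made
    for illustration."""

    def __init__(self, xpu_names):
        # One delivery queue per directly attached XPU.
        self.queues = {name: deque() for name in xpu_names}

    def route(self, packet):
        # Forward the packet to the destination XPU's queue.
        self.queues[packet["dst"]].append(packet)

npu = NPUSketch(["GPU", "APU", "VPU", "SPU", "PPU"])
# The GPU talks directly to the SPU through the NPU, bypassing the host OS.
npu.route({"src": "GPU", "dst": "SPU", "data": b"store frame"})
```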
[0008] Furthermore, by virtualizing the remaining resources of the
host computer, such as its physical memory, ROM, real-time clocks,
interrupts, and the like, the present invention allows a single
chipset to provide multiple, virtual host computers with each being
attached to this NPU. Each of these virtual computers or virtual
host may run its own copy of an identical or different operating
system, and may communicate with other virtual computers and
integrated networked appliances using standard networking
protocols. Effectively, the present invention embodies its own
hardware-level operating system and graphical user interface (GUI)
that reside below the standard host operating system and host
computer definition, and allow the computer user to easily
configure the network or to switch from one virtual computer to
another without changing the standard definition of that host
computer.
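The virtual-host arrangement above can likewise be illustrated with a minimal sketch: several virtual hosts on one chipset, each with its own operating system, all reaching the shared XPUs through the same NPU. The host and OS names below are hypothetical.

```python
# Illustrative sketch only: names and operating systems are assumptions.
class VirtualHost:
    def __init__(self, name, os_name):
        self.name = name
        self.os_name = os_name  # each virtual host may run a different OS

SHARED_XPUS = frozenset({"GPU", "APU", "VPU", "SPU"})

def reachable_xpus(host):
    """Every virtual host attached to the NPU can access the shared
    XPUs, regardless of which operating system it runs."""
    return SHARED_XPUS

hosts = [VirtualHost("vhost0", "OS-A"), VirtualHost("vhost1", "OS-B")]
access = {h.name: reachable_xpus(h) for h in hosts}
```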
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] The teachings of the present invention can be readily
understood by considering the following detailed description in
conjunction with the accompanying drawings, in which:
[0010] FIG. 1 illustrates a block diagram of conventional internal
content sources and data pipes;
[0011] FIG. 2 illustrates a block diagram of novel internal content
sources and data pipes of the present invention;
[0012] FIG. 3 illustrates a block diagram where a network of host
computers are in communication with each other via a plurality of
network processing units;
[0013] FIG. 4 illustrates a block diagram where a host computer's
resources are networked via a network processing unit of the
present invention; and
[0014] FIG. 5 illustrates a block diagram of a network of virtual
personal computers in communication with a network processing unit
of the present invention.
[0015] To facilitate understanding, identical reference numerals
have been used, where possible, to designate identical elements
that are common to the figures.
DETAILED DESCRIPTION
[0016] FIG. 2 illustrates a block diagram of novel internal content
sources and data pipes 200 of the present invention. Unlike FIG. 1,
the present network architecture has a network processing unit 210
of the present invention at the center of the internal content
sources and data pipes. The host CPU/OS 250 is no longer central to
the data routing scheme. One advantage of this new architecture is
that the NPU 210 provides both local or host access and remote
access acceleration.
[0017] An operating system is any software platform for application
programs; typical examples are Microsoft Windows, Unix, and Apple
Macintosh OS. An operating system can be run on top of another
operating system (an example of a virtual operating system) or
another underlying software platform, possibly as an application
program.
[0018] In operation, the host CPU/OS 250 has a virtual port (i.e.,
host MAC) that is in communication with the network processing unit
210 and communicates with the NPU as if it is an external network
appliance using standard networking protocols, e.g., TCP/IP
protocols. In one embodiment, the host computer communicates via
the NPU with one or more auxiliary or dedicated processing units
220, 230 that are deployed to perform dedicated tasks. These
auxiliary processing units can be part of the host or can be
deployed separate from the host to meet different application
requirements.
[0019] For example, some of these auxiliary processing units
include, but are not limited to, a graphics processing unit (GPU),
an audio processing unit (APU), a video processing unit (VPU), a
physics processing unit (PPU) and a storage processing unit (SPU)
220. Some of these auxiliary processing units can be deployed as
part of the media engines 230, whereas the SPU 220 is deployed with
the storage devices of the host. Finally, the network processing
unit itself is an XPU because it can, in addition to routing
packets among XPUs, perform various processing accelerations on
these packets, such as authentication, encryption, compression,
TCP, IPSec/VPN/PPP encapsulation and so on.
[0020] In one embodiment, the NPU 210 is a network router appliance
that resides inside the same "box" or chassis as the host computer
250, i.e., typically within the same chipset. The NPU serves to
connect various other "XPUs" that perform dedicated functions
such as:
[0021] 1) Storage Processing Unit (SPU) is an auxiliary processing
unit that implements a file system, where the file system can be
accessed locally by the host or remotely via the NPU's connection
to the outside world. The SPU is a special XPU because it behaves
as an endpoint for data storage. Streams can originate from an SPU
file or terminate at an SPU file.
[0022] 2) Audio Processing Unit (APU) is an auxiliary processing
unit that implements audio effects on individual "voices" and mixes
them down to a small number of channels. The APU also performs
encapsulation/decapsulation of audio packets that are
transmitted/received over the network via the NPU.
[0023] 3) Video Processing Unit (VPU) is an auxiliary processing
unit that is similar to the APU except that it operates on
compressed video packets (e.g., MPEG-2 compressed), either
compressing them or uncompressing them. The VPU also performs
encapsulations into bitstreams or network video packets.
[0024] 4) Graphics Processing Unit (GPU) is an auxiliary processing
unit that takes graphics primitives and produces (partial) frame
buffers. The GPU is a special XPU because it acts as an endpoint
for rendered graphics primitives. Streams can terminate at a GPU
frame buffer or originate as raw pixels from a frame buffer.
[0025] 5) Physics Processing Unit (PPU) is an auxiliary processing
unit that takes object positions, current velocity vectors, and
force equations, and produces new positions, velocity vectors, and
collision information.
[0026] 6) Network Processing Unit (NPU) is itself an XPU because it
can, in addition to routing packets among XPUs, perform various
processing accelerations on these packets, such as authentication,
encryption, compression, TCP, IPSec/VPN/PPP encapsulation and the
like.
[0027] Some of the above XPUs have a number of commonalities with
respect to their association with the host 250 and the NPU 210.
First, an XPU can be accessed by the host CPU and O/S 250
directly as a local resource. Namely, communication is effected by
using direct local communication channels.
[0028] Second, an XPU can be placed on the network via the NPU and
accessed remotely from other network nodes (as shown in FIG. 3
below). This indicates that an XPU is capable of processing
information that is encapsulated in network packets.
[0029] Third, an XPU can be accessed as a "remote" node even from
the local host. Namely, communication is effected via the NPU by
using network protocols.
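The three access modes above can be pictured with a small sketch. This is purely illustrative; the class and method names (Xpu, Npu, handle_packet, and the IP addresses) are invented for the example and do not appear in the patent.

```python
import json

class Xpu:
    """A toy auxiliary processing unit exposing one dedicated operation."""
    def process(self, payload):
        # Stands in for dedicated GPU/APU/SPU work.
        return payload.upper()

    # Mode 1: direct local channel -- a plain call, no packets involved.
    def local_call(self, data):
        return self.process(data)

    # Modes 2 and 3: the XPU can also consume network-encapsulated input.
    def handle_packet(self, packet_bytes):
        packet = json.loads(packet_bytes)
        result = self.process(packet["payload"])
        return json.dumps({"src": packet["dst"], "dst": packet["src"],
                           "payload": result}).encode()

class Npu:
    """Toy router: forwards packets to the XPU registered at the destination."""
    def __init__(self):
        self.table = {}
    def attach(self, addr, xpu):
        self.table[addr] = xpu
    def route(self, packet_bytes):
        dst = json.loads(packet_bytes)["dst"]
        return self.table[dst].handle_packet(packet_bytes)

npu = Npu()
xpu = Xpu()
npu.attach("10.0.0.2", xpu)

# Mode 1: the host accesses the XPU as a local resource.
assert xpu.local_call("abc") == "ABC"

# Mode 3: the same host reaches the same XPU as a "remote" node via the NPU,
# using packetized, protocol-style communication.
request = json.dumps({"src": "10.0.0.1", "dst": "10.0.0.2",
                      "payload": "abc"}).encode()
reply = json.loads(npu.route(request))
assert reply["payload"] == "ABC"
```

The same XPU serves both paths, which is the essence of giving auxiliary processors first-class network-appliance status without removing their local attachment.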
[0030] Fourth, an XPU is always in an "on" state (like most
appliances) even when the host (CPU+O/S) is in the "off" state.
This unique feature allows the XPUs to operate without the
involvement of the host CPU/OS, e.g., extracting data from a disk
drive of the host without the involvement of the host. More
importantly, the host's resources are still available even though
the CPU/OS may be in a dormant state, e.g., in a sleep mode.
[0031] Fifth, an XPU has at least two sets of processing queues,
one for non-real-time packets and at least one for real-time
packets. This duality of queues, combined with similar real-time
queues in the NPU, allows the system of NPU and XPUs to guarantee
latencies and bandwidth for real-time streams.
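The dual-queue scheme can be sketched as follows. This is an illustrative model, not the patent's implementation: real-time packets are always dequeued ahead of non-real-time ones, which is the property that makes latency guarantees possible.

```python
from collections import deque

class XpuQueues:
    """Toy model of an XPU's two processing queues."""
    def __init__(self):
        self.real_time = deque()
        self.best_effort = deque()

    def enqueue(self, packet, real_time=False):
        (self.real_time if real_time else self.best_effort).append(packet)

    def dequeue(self):
        # Real-time traffic preempts non-real-time traffic.
        if self.real_time:
            return self.real_time.popleft()
        if self.best_effort:
            return self.best_effort.popleft()
        return None

q = XpuQueues()
q.enqueue("file-read")                     # non-real-time packet
q.enqueue("audio-frame", real_time=True)   # real-time packet
q.enqueue("video-frame", real_time=True)
assert q.dequeue() == "audio-frame"
assert q.dequeue() == "video-frame"
assert q.dequeue() == "file-read"
```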
[0032] Sixth, an XPU has two software (SW) drivers, one that
manages the host-side connection to the XPU, and one that manages
the remotely-accessed component of the XPU. In operation, the SW
drivers communicate with the XPU using abstract command queues,
called push buffers (PBs). Each driver has at least one PB going
from the driver to the XPU and at least one PB going from the XPU
to the driver. Push buffers are described in U.S. Pat. No.
6,092,124, which is herein incorporated by reference.
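A minimal sketch of the push-buffer arrangement is shown below: one abstract command queue per direction between a driver and its XPU. The command format here is invented for illustration; the actual push-buffer mechanism is the one described in U.S. Pat. No. 6,092,124.

```python
from collections import deque

class PushBuffer:
    """One-directional abstract command queue between a driver and an XPU."""
    def __init__(self):
        self._q = deque()
    def push(self, cmd):
        self._q.append(cmd)
    def pull(self):
        return self._q.popleft() if self._q else None

# At least one PB in each direction, as the text requires.
driver_to_xpu = PushBuffer()
xpu_to_driver = PushBuffer()

# The driver issues a command; the XPU consumes it and posts a completion.
driver_to_xpu.push({"op": "mix_voices", "count": 8})
cmd = driver_to_xpu.pull()
xpu_to_driver.push({"done": cmd["op"]})

assert xpu_to_driver.pull() == {"done": "mix_voices"}
```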
[0033] Seventh, an XPU can also be accessed on the host side
directly by a user-level application. Namely, this involves
lazy-pinning of user-space buffers by the O/S. Lazy-pinning means
to lock the virtual-to-physical address translations of memory
pages on demand, i.e., when the translations are needed by the
particular XPU. When the translations are no longer needed, they
can be unlocked, allowing the operating system to page out those
pages. The virtual-to-physical mappings of these buffers are passed
to the XPU. A separate pair of PBs is linked into the user's
address space and the O/S driver coordinates context switches with
the XPU.
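The lazy-pinning behavior can be sketched as below. All names are illustrative: a translation is locked only when the XPU actually needs it, and released afterwards so the operating system may page the memory out again.

```python
class LazyPinner:
    """Toy model of on-demand locking of virtual-to-physical translations."""
    def __init__(self, page_table):
        self.page_table = page_table   # virtual page -> physical frame
        self.pinned = set()

    def translate(self, vpage):
        # Pin on demand: lock the mapping when the XPU first needs it.
        self.pinned.add(vpage)
        return self.page_table[vpage]

    def unpin(self, vpage):
        # Translation no longer needed; the O/S may page this out again.
        self.pinned.discard(vpage)

pinner = LazyPinner({0x1000: 0xA000, 0x2000: 0xB000})
assert pinner.translate(0x1000) == 0xA000   # mapping is now pinned
assert 0x1000 in pinner.pinned
pinner.unpin(0x1000)                        # page may be evicted again
assert 0x1000 not in pinner.pinned
```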
[0034] Although the present invention discloses the use of a
network processing unit 210 to perform routing functions without
the involvement of the CPU/OS, the CPU/OS 250 nevertheless still
has an alternate direct communication channel 255 with its
resources, e.g., storage devices. This provides the host CPU/OS
with the option of communicating with its resources or media
engines via the NPU or directly via local access channels 255 or
257.
[0035] In fact, although the CPU/OS is not involved with the
general routing function, in one embodiment of the present
invention, exception routing issues are resolved by the host
CPU/OS. For example, if the NPU receives a packet that it is unable
to process, the NPU will forward the packet to the host CPU/OS for
resolution. This limited use of the CPU/OS serves to accelerate
host processing, while retaining the option to more judiciously use
the processing power of the host CPU/OS to resolve difficult
issues.
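The exception-routing policy in the paragraph above can be sketched as a simple fallback: the NPU handles what it can on the fast path and escalates everything else to the host CPU/OS. The protocol names and handler shapes here are invented for illustration.

```python
def npu_route(packet, handlers, host_fallback):
    """Route via a fast-path handler if one exists, else escalate to the host."""
    handler = handlers.get(packet["proto"])
    if handler is not None:
        return handler(packet)
    return host_fallback(packet)   # exception path: host CPU/OS resolves it

# Fast-path handlers the NPU implements itself.
fast_path = {
    "tcp": lambda p: ("npu", p["data"]),
    "udp": lambda p: ("npu", p["data"]),
}
# Slow path: forward the packet to the host for resolution.
host = lambda p: ("host", p["data"])

assert npu_route({"proto": "tcp", "data": "x"}, fast_path, host)[0] == "npu"
# An unfamiliar protocol falls through to the host.
assert npu_route({"proto": "sctp", "data": "y"}, fast_path, host)[0] == "host"
```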
[0036] Additionally, the host resources may also be accessed via
the NPU without the involvement of the host CPU/OS 250 via
input/output communication channel 240, e.g., via a USB connection. For
example, the present architecture can virtualize the remaining
resources of the host computer 250, such as its physical memory,
read only memory (ROM), real-time clocks, interrupts, and so on,
thereby allowing a single chipset to provide multiple virtual hosts
with each host being attached to the NPU 210.
[0037] One unique aspect of the present invention is that the XPUs
have logically direct attachments to the NPU that effectively
serves as an integrated router, thereby allowing XPUs to be seen as
separate network appliances. Since these auxiliary processing units
have first-class status in this logical network architecture, they
are allowed to communicate with each other or with any external
computer (e.g., via another NPU) directly using standard internet
protocols such as IP, TCP, UDP and the like without the involvement
of the host CPU/OS. Using this novel architecture, the NPU provides
both local (or host) access and remote access acceleration in a
distributed computing environment.
[0038] FIG. 3 illustrates a block diagram where a network of host
computers 300a-n are in communication with each other via a
plurality of network processing units 310a-n. This unique
configuration provides both host access and remote access
acceleration. The accelerated functions can be best understood by
viewing the present invention in terms of packetized streams.
[0039] It is best to view this system of NPU and XPUs in the
context of streams of packetized data that flow within this system.
There are various types of streams that are allowed by the system.
In this discussion, the term "host" means the combination of host
CPU and memory in the context of the O/S kernel or a user-level
process. The term "node" refers to a remote networked host or
device that is attached to the NPU via a wired or wireless
connection to a MAC that is directly connected to the NPU (e.g., as
shown in FIG. 4 below).
[0040] A host-to-XPU stream is a stream that flows directly from
the host 350a to the XPU 330a. This is a typical scenario for a
dedicated XPU (e.g., a dedicated GPU via communication path 357).
The stream does not traverse through the NPU 310a.
[0041] An XPU-to-host stream is a stream that flows directly from
the XPU to the host. One example is a local file being read from
the SPU 320a via path 355. The stream does not traverse through the
NPU 310a.
[0042] A host-to-XPU-to-host stream is a stream that flows from
host 350a to an XPU 330a for processing then back to the host 350a.
One example is where the host forwards voice data directly to the
APU for processing of voices into final mix buffers that are
subsequently returned to the host via path 357. The stream does not
traverse through the NPU 310a.
[0043] A host-to-NPU-to-XPU stream is a networked stream that flows
from the host 350a via NPU 310a to an XPU 330a or 320a. The three
parties transfer packetized data using standard networking
protocols, e.g., TCP/IP.
[0044] An XPU-to-NPU-to-Host is a networked stream that flows from
an XPU 330a or 320a via the NPU 310a to the host 350a. The three
parties transfer packetized data using standard networking
protocols, e.g., TCP/IP.
[0045] A host-to-NPU-to-XPU-to-NPU-to-host is a networked stream
that is the combination of the previous two streams. The three
parties transfer packetized data using standard networking
protocols, e.g., TCP/IP.
[0046] A host-to-NPU-to-Node is a networked stream that flows from
the host 350a via the NPU 310a to a remote node (e.g., NPU 310b).
This allows a local host 350a to communicate and access XPUs 330b
of another host via a second NPU 310b.
[0047] A Node-to-NPU-to-Host is a reverse networked stream where
the stream flows from a remote node (e.g., NPU 310b) via the NPU
310a to the host 350a. This allows a remote NPU 310b to communicate
with a local host 350a via a local NPU 310a.
[0048] A Node-to-NPU-to-XPU is a networked stream that flows from a
remote node 350b via the NPU 310a to an XPU 330a where it
terminates. This allows a remote NPU 310b to communicate with a
local XPU 330a via a local NPU 310a.
[0049] An XPU-to-NPU-to-Node is a networked stream that flows from
an XPU 330a where it originates to a remote node (e.g., NPU 310b)
via local NPU 310a.
[0050] A Node0-to-NPU-to-XPU-to-NPU-to-Node1 is a combination of
the previous two streams. It should be noted that Node0 and Node1
may be the same or different. For example, Node0 is 310a; NPU is
310b; XPU is 330b; NPU is 310b; and Node1 is 310n. Alternatively,
Node0 is 310a; NPU is 310b; XPU is 330b; NPU is 310b; and Node1 is
310a.
[0051] A
{Host,Node0,XPU0}-to-NPU-to-XPU1-to-NPU-to-XPU2-to-NPU-to-{Host,Node1,XPU3}
is a stream that originates from the host, a remote node,
or an XPU, passes through the NPU to another XPU for some
processing, then passes through the NPU to another XPU for some
additional processing, then terminates at the host, another remote
node, or another XPU. It should be clear that the present
architecture of a network of integrated processing units provides a
powerful and flexible distributed processing environment, where
both host access and remote access acceleration are greatly
enhanced.
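One way to picture the stream taxonomy above, purely as an illustration: a stream is a list of hops, and any hop through an NPU means packetized transfer over standard networking protocols, while a direct host-to-XPU hop bypasses the NPU entirely.

```python
def traverses_npu(stream):
    """True if any hop of the stream passes through an NPU."""
    return any(hop.startswith("NPU") for hop in stream)

# Host-to-XPU stream: direct, does not traverse the NPU.
assert not traverses_npu(["Host", "XPU"])
# Host-to-NPU-to-XPU stream: networked, uses standard protocols.
assert traverses_npu(["Host", "NPU", "XPU"])
# Node0-to-NPU-to-XPU-to-NPU-to-Node1: a combination stream.
assert traverses_npu(["Node0", "NPU", "XPU", "NPU", "Node1"])
```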
[0052] Under the present architecture, numerous advantages are
achieved. First, it is beneficial to tightly integrate other
computers and network appliances into the same chipset. Second, it
is very advantageous to offload a host computer's I/O functions
into a distributed network of intelligent processors, where
traditional latencies associated with overtaxed CPU/OS are
resolved. Third, it is advantageous to provide these auxiliary I/O
processors with first-class network-appliance status within the
chipset (optionally illustrated in FIG. 2 with dashed lines) without
changing the definition of the host computer. Fourth, it is
advantageous to allow these auxiliary I/O processors to be shared
among the host computer, external computers, and internal and
external network appliances. Fifth, it is advantageous to allow the
remaining resources of the host computer to be virtualized so that
multiple virtual copies of the host computer may be embodied in the
same chipset, while sharing the network of intelligent auxiliary
I/O processors. Finally, it is advantageous to use a hardware-level
operating system and graphical user interface (GUI) that allow the
user to configure the network and seamlessly switch among virtual
copies of the host computer or virtual host.
[0053] In one embodiment of the present invention, real-time media
streaming is implemented using the above described network of
integrated processing units. Specifically, media streaming
typically involves multiple software layers. Thus, latencies can be
unpredictable, particularly when the software runs on a
general-purpose computer. More importantly, media streaming
typically has a severe adverse impact on other applications running
on the host computer.
[0054] However, by attaching media devices such as an APU or GPU to
an NPU+SPU combination, it is now possible to minimize and
guarantee latencies as well as offload the main host CPU. For
example, referring to FIG. 3, control requests may arrive from a
remote recipient 350b (typically attached wirelessly). These control
requests may include play, stop, rewind, forward, pause, select
title, and so on. Once the stream is set up, the raw data can be
streamed directly from a disk managed by the SPU 320a through the
NPU 310a to the destination client. Alternatively, the data may get
preprocessed by the GPU 330a or APU 330a prior to being sent out
via the NPU 310a. One important aspect again is that real-time
media streaming can take place without host CPU 350a involvement.
Dedicated queuing throughout the system will guarantee latencies
and bandwidth.
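The control-request handling described above can be sketched as a small state machine. The names and the set of states are invented for the example; the point is that the session state is driven by requests arriving from the remote recipient, while the media data itself flows SPU to NPU to client without host CPU involvement.

```python
class StreamSession:
    """Toy model of a media-stream session controlled by remote requests."""
    STATES = {"play": "playing", "stop": "stopped", "pause": "paused"}

    def __init__(self):
        self.state = "stopped"

    def control(self, request):
        # Control requests (play, stop, pause, ...) arrive via the NPU.
        if request not in self.STATES:
            raise ValueError("unknown control request: " + request)
        self.state = self.STATES[request]
        return self.state

session = StreamSession()
assert session.control("play") == "playing"   # stream set up; data now
assert session.control("pause") == "paused"   # flows SPU -> NPU -> client
assert session.control("stop") == "stopped"
```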
[0055] This media streaming embodiment clearly demonstrates the
power and flexibility of the present invention. One practical
implementation of this real-time media streaming embodiment is
within the home environment, where a centralized multimedia host
server or computer has a large storage device that contains a
library of stored media streams or it may simply be connected to a
DVD player, a "PVR" (personal video recorder) or "DVR" (digital
video recorder). If there are other client devices throughout the
home, it is efficient to use the above network architecture to
implement real-time media streaming, where a media stream from a
storage device of the host computer can be transmitted to another
host computer or a television set in a different part of the home.
Thus, the real-time media streaming is implemented without the
involvement of the host computer and with guaranteed latencies and
bandwidth.
[0056] FIG. 4 illustrates a block diagram where a host computer's
resources are networked via a network processing unit 410 of the
present invention. Specifically, a host 450 communicates with the
NPU 410 via a MAC 415 (i.e., a host MAC). In turn, a plurality of
XPUs and other host resources 430a are connected to the NPU via a
plurality of MACs 425 that interface with a MAC Interface (MI) (not
shown) of the NPU. One example of an NPU is disclosed in US patent
application entitled "A Method And Apparatus For Performing Network
Processing Functions" with attorney docket NVDA/P000413.
[0057] FIG. 5 illustrates a block diagram of a network of virtual
personal computers or virtual hosts that are in communication with
a network processing unit 520 of the present invention. More
specifically, FIG. 5 illustrates a network of virtual personal
computers (VPCs) in a single system (or a single chassis) 500,
where the system may be a single personal computer, a set top box,
a video game console or the like.
[0058] In operation, FIG. 5 illustrates a plurality of virtual
hosts 510a-e, which may comprise a plurality of different operating
systems (e.g., Microsoft Corporation's Windows (two separate copies
510a and 510b), and Linux 510c), a raw video game application 510d
or other raw applications 510e, where the virtual hosts treat the
storage processing unit 530 as a remote file server having a
physical storage 540. In essence, one can perceive FIG. 5 as
illustrating a "network of VPCs in a box".
[0059] In one embodiment, the NPU 520 manages multiple IP addresses
inside the system for each VPC. For example, the NPU 520 may be
assigned a public IP address, whereas each of the VPCs is assigned
a private IP address, e.g., in accordance with Dynamic Host
Configuration Protocol (DHCP). Thus, each of the VPCs can
communicate with each other and the SPU using standard networking
protocols. Standard network protocols include, but are not limited
to: TCP; TCP/IP; UDP; NFS; HTTP; SMTP; POP; FTP; NNTP; CGI; DHCP;
and ARP (to name only a few that are known in the art).
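The addressing scheme above can be sketched as follows. This is an illustrative model only: the NPU holds one public address and hands each VPC a private address from a DHCP-style pool. The addresses and pool size are invented for the example.

```python
import ipaddress

class NpuAddressManager:
    """Toy model of the NPU leasing private addresses to virtual PCs."""
    def __init__(self, public_ip, private_net):
        self.public_ip = ipaddress.ip_address(public_ip)
        self.pool = ipaddress.ip_network(private_net).hosts()
        self.leases = {}

    def lease(self, vpc_name):
        # Hand out the next free private address, DHCP-fashion;
        # a VPC that already holds a lease keeps its address.
        if vpc_name not in self.leases:
            self.leases[vpc_name] = next(self.pool)
        return self.leases[vpc_name]

npu = NpuAddressManager("203.0.113.7", "192.168.0.0/29")
assert str(npu.lease("windows-a")) == "192.168.0.1"
assert str(npu.lease("linux"))     == "192.168.0.2"
# Re-leasing the same VPC returns its existing private address.
assert str(npu.lease("windows-a")) == "192.168.0.1"
```

With each VPC holding its own private address, VPC-to-VPC and VPC-to-SPU traffic can use the same standard protocols listed in the paragraph above.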
[0060] It should be understood that the XPUs of the present
invention can be implemented as one or more physical devices that
are coupled to the host CPU through a communication channel.
Alternatively, the XPUs can be represented and provided by one or
more software applications (or even a combination of software and
hardware, e.g., using application specific integrated circuits
(ASIC)), where the software is loaded from a storage medium, (e.g.,
a ROM, a magnetic or optical drive or diskette) and operated in the
memory of the computer. As such, the XPUs (including associated
methods and data structures) of the present invention can be stored
and provided on a computer readable medium, e.g., ROM or RAM
memory, magnetic or optical drive or diskette and the like.
Alternatively, the XPUs can be represented by Field Programmable
Gate Arrays (FPGA) having control bits.
[0061] Although various embodiments which incorporate the teachings
of the present invention have been shown and described in detail
herein, those skilled in the art can readily devise many other
varied embodiments that still incorporate these teachings. In the
claims, elements of method claims are listed in a particular order,
but no order for practicing of the invention is implied, even if
elements of the claims are numerically or alphabetically
enumerated.
* * * * *