U.S. patent number 8,266,238 [Application Number 11/646,083] was granted by the patent office on 2012-09-11 for memory mapped network access.
This patent grant is currently assigned to Intel Corporation. Invention is credited to Michael A. Rothman, Vincent J. Zimmer.
United States Patent: 8,266,238
Zimmer, et al.
September 11, 2012
Memory mapped network access
Abstract
The present disclosure relates to memory access, and
specifically to memory access utilizing internet protocol (IP)
addressing semantics. Various embodiments, methods, apparatus and
systems are provided that allow a system to detect that a memory
access has been attempted involving a region of memory that is
mapped to a network device; and to perform the memory access
utilizing, at least in part, the network device and a network
interface. Other embodiments may be described and claimed.
Inventors: Zimmer; Vincent J. (Federal Way, WA), Rothman; Michael A. (Puyallup, WA)
Assignee: Intel Corporation (Santa Clara, CA)
Family ID: 39585564
Appl. No.: 11/646,083
Filed: December 27, 2006
Prior Publication Data
Document Identifier: US 20080162680 A1
Publication Date: Jul 3, 2008
Current U.S. Class: 709/213; 709/223; 711/202; 711/6
Current CPC Class: G06F 12/1081 (20130101); H04L 61/00 (20130101); H04L 41/046 (20130101); H04L 29/12009 (20130101)
Current International Class: G06F 15/167 (20060101)
Field of Search: 709/213; 711/202,203,206
References Cited
Other References
Barham, et al., "Xen and the Art of Virtualization," pp. 164-177, Jan. 1, 2003.
Romanow, Allyn, and Stephen Bailey, "An Overview of RDMA over IP," Internet Society, 2002.
Primary Examiner: Blair; Douglas
Assistant Examiner: Survillo; Oleg
Attorney, Agent or Firm: Schwabe, Williamson & Wyatt,
P.C.
Claims
The invention claimed is:
1. A method comprising: mapping, by a virtual machine monitor of a
system, a plurality of virtual memory addresses within a region of
a virtual memory address space of a virtual machine to a plurality
of internet protocol (IP) addresses, wherein the plurality of
virtual memory addresses being mapped correspond to physical memory
locations beyond a range of physical memory addresses available on
the system, and wherein the virtual machine is disposed on the
system; trapping, by the virtual machine monitor, a memory read or
write access made by a guest operating system; determining, by the
virtual machine monitor, that the memory read or write access
occurs for a memory address that is greater than the range of
physical memory addresses available on the system, wherein the
memory read or write access involves one of the plurality of
virtual memory addresses within said region of the virtual memory
address space that is mapped to one of the plurality of IP
addresses; and transmitting, by the virtual machine monitor, a data
read or write request corresponding to the memory read or write
access to a network device associated with the one of the plurality
of IP addresses corresponding to the one of the plurality of the
virtual memory addresses.
2. The method of claim 1, wherein the memory access facilitates
inter-process communication.
3. The method of claim 1, wherein the range of physical memory
addresses available on the system includes 2^44 addresses, and
the range of virtual memory addresses provided to the guest
operating system includes 2^64 addresses.
4. The method of claim 1, wherein the IP addresses are IPv6
addresses.
5. The method of claim 1, wherein mapping comprises mapping, by the
virtual machine monitor, each of the plurality of virtual memory
addresses within the region of the virtual memory address space of
the virtual machine to a corresponding IP address of the plurality
of internet protocol (IP) addresses on a one-to-one basis.
6. The method of claim 1, wherein mapping comprises mapping, by the
virtual machine monitor, multiple ones of the plurality of virtual
memory addresses within the region of the virtual memory address
space of the virtual machine to one of the plurality of internet
protocol (IP) addresses, and wherein mapping further comprises
associating multiple offsets to the mapped IP address for the
multiple ones of the plurality of virtual memory addresses.
7. A method comprising: mapping, by a hypervisor of a computing
device, a plurality of virtual memory addresses within a region of
a virtual memory address space of a virtual machine to a plurality
of internet protocol (IP) addresses, wherein the virtual memory
addresses being mapped correspond to physical memory locations
beyond a range of physical memory addresses available on the
computing device, and wherein the virtual machine is disposed on
the computing device; trapping, by the hypervisor, a memory read or
write access by a guest operating system; determining, by the
hypervisor, whether the memory read or write access occurs for a
memory address that is greater than the range of physical memory
addresses available on the computing device, wherein the memory
read or write access involves one of the plurality of virtual
memory addresses mapped to one of the plurality of IP addresses;
upon determining that the memory read or write access occurs for
the memory address that is greater than the range of physical
memory addresses available on the computing device, forwarding, by
the hypervisor, a data read or write request corresponding to the
memory read or write access to a network device associated with the
one of the plurality of IP addresses corresponding to the one of
the plurality of virtual memory addresses; and upon determining
that the memory read or write access occurs for the memory address
that is within the range of physical memory addresses available on
the computing device, allowing the memory read or write access to
proceed without said forwarding.
8. The method of claim 7, wherein the memory read or write access
facilitates communication between the virtual machine and another
virtual machine disposed on the computing device or another
computing device.
9. The method of claim 8, wherein trapping the memory read or write
access includes utilizing a hardware feature of the computing
device running the hypervisor.
10. The method of claim 9, wherein the hardware feature of the
computing device is a portion of Intel Virtualization
Technology.
11. The method of claim 7, wherein the hypervisor keeps the guest
operating system unaware of whether the memory read or write access
is facilitated by the forwarding of the corresponding data read or
write request.
12. An apparatus comprising: a physical memory configured to store
data; a chipset configured to support a virtual machine monitor
that is configured to: map a plurality of virtual memory addresses
within a region of a virtual memory address space of a virtual
machine to a plurality of network addresses, wherein the plurality
of virtual memory addresses being mapped correspond to physical
memory locations beyond a range of physical memory addresses
available on the physical memory of the apparatus, and wherein the
virtual machine is disposed on the apparatus; trap a memory read or
write access made by a guest operating system; determine that the
memory read or write access occurs for a memory address that is
greater than the range of physical memory addresses available on
the physical memory of the apparatus, wherein the memory read or
write access involves one of the plurality of virtual memory
addresses within said region of the virtual memory address space
that is mapped to one of the plurality of network addresses; and
forward a data read or write request corresponding to the memory
read or write access to a network device associated with the one of
the plurality of network addresses corresponding to the one of the
plurality of the virtual memory addresses; and a network interface
coupled with the chipset and configured to enable the virtual
machine monitor to forward the memory read or write request by
transmitting the memory read or write request to the network device
associated with the one of the plurality of network addresses,
wherein the network device comprises a network memory configured to
store data.
13. The apparatus of claim 12, wherein the virtual machine monitor
is further configured to divide the virtual memory address space
between the physical memory of the apparatus and the network memory
of the network device.
14. The apparatus of claim 13, wherein the virtual machine monitor
is further configured to allow a memory read or write access for a
memory address that is within the range of physical memory
addresses available on the physical memory of the apparatus to
proceed without said transmitting the corresponding data read or
write request to the network device.
15. The apparatus of claim 12, wherein the virtual machine monitor
is a hypervisor.
16. The apparatus of claim 12, wherein the chipset is further
configured to map each of the plurality of virtual memory addresses
within the region of the virtual memory address space of the
virtual machine to a corresponding network address of the plurality
of network addresses on a one-to-one basis.
17. The apparatus of claim 12, wherein the chipset is further
configured to map multiple ones of the plurality of virtual memory
addresses within the region of the virtual memory address space of
the virtual machine to one of the plurality of network addresses,
and associate multiple offsets to the mapped network address for
the multiple ones of the plurality of virtual memory addresses.
18. An article comprising: a tangible non-transitory
computer-readable storage medium having a plurality of machine
accessible instructions, wherein the instructions, in response to
execution of the instructions by a computing device, cause a
virtual machine monitor of the computing device to: map a plurality
of virtual memory addresses within a region of a virtual memory
address space of a virtual machine to a plurality of internet
protocol (IP) addresses, wherein the virtual memory addresses being
mapped correspond to physical memory locations beyond a range of
physical memory addresses available on the computing device, and
wherein the virtual machine is disposed on the computing device;
trap a memory read or write access made by a guest operating
system; determine that the memory read or write access occurs for a
memory address that is greater than the range of physical memory
addresses available on the computing device, wherein the memory
read or write access involves one of the plurality of virtual
memory addresses within the region of the virtual memory address
space that is mapped to one of the plurality of IP addresses; and
transmit a data read or write request corresponding to the memory
read or write access to a network device associated with the one of
the plurality of IP addresses corresponding to the one of the
plurality of virtual memory addresses.
Description
BACKGROUND
1. Field
The present disclosure relates to a technique for memory access,
and specifically to memory access utilizing internet protocol (IP)
addressing semantics.
2. Background Information
Currently, the market is driving larger physical and virtual
address spaces on commodity hardware, as exemplified by the EM64T
and AMD64 extensions to the x86 instruction set. Also, the High
Performance Computing (HPC) community is increasingly moving
towards clusters of commodity systems typically connected via a
high-speed interconnect. Such interconnects may include InfiniBand
or Quadrics technology.
Typically, these clusters or distributed computing systems need to
communicate with other systems within and outside the cluster.
Often, various processes within a program running on the cluster
need to communicate or provide data to another process within the
program. Such Inter-Process Communication (IPC) incurs a large
overhead. Unfortunately, there is currently no widely used standard
messaging mechanism for IPC in large systems, beyond massive SMPs
that cost a large amount of money to maintain cache coherence.
Often scientific HPC applications may be coded utilizing the
Message Passing Interface (MPI) library in order to gain some
degree of portability. However, there is invariably some layer of
software that must bind to the particular interconnection
transport. Therefore, it would be beneficial for HPC deployments to
have a low-latency IPC mechanism. Preferably the mechanism would be
highly portable and available via commodity hardware. It is
understood that, while any such mechanism may be advantageous for
HPC systems, such a mechanism may also be useful for peer-to-peer
gaming and other emergent network use-models.
BRIEF DESCRIPTION OF THE DRAWINGS
Subject matter is particularly pointed out and distinctly claimed
in the concluding portions of the specification. The claimed
subject matter, however, both as to organization and the method of
operation, together with objects, features and advantages thereof,
may be best understood by reference to the following detailed
description when read with the accompanying drawings in which:
FIG. 1 is a flowchart illustrating an embodiment of a technique for
memory access in accordance with the claimed subject matter;
and
FIG. 2 is a block diagram illustrating an embodiment of a system
and apparatus for memory access in accordance with the claimed
subject matter.
DETAILED DESCRIPTION
In the following detailed description, numerous details are set
forth in order to provide a thorough understanding of the present
claimed subject matter. However, it will be understood by those
skilled in the art that the claimed subject matter may be practiced
without these specific details. In other instances, well-known
methods, procedures, components, and circuits have not been
described in detail so as to not obscure the claimed subject
matter.
In the following detailed description, reference is made to the
accompanying drawings which form a part hereof, and in which is
shown by way of illustration embodiments in which the invention may
be practiced. It is to be understood that other embodiments may be
utilized and structural or logical changes may be made without
departing from the scope of the present invention. Therefore, the
following detailed description is not to be taken in a limiting
sense, and the scope of embodiments in accordance with the present
invention is defined by the appended claims and their
equivalents.
Various operations may be described as multiple discrete operations
in turn, in a manner that may be helpful in understanding
embodiments of the present invention; however, the order of
description should not be construed to imply that these operations
are order dependent.
For the purposes of the description, a phrase in the form "A/B"
means A or B. For the purposes of the description, a phrase in the
form "A and/or B" means "(A), (B), or (A and B)". For the purposes
of the description, a phrase in the form "at least one of A, B, and
C" means "(A), (B), (C), (A and B), (A and C), (B and C), or (A, B
and C)". For the purposes of the description, a phrase in the form
"(A)B" means "(B) or (AB)" that is, A is an optional element. And,
so forth.
For ease of understanding, the description will be in large part
presented in the context of commodity networking; however, the
present invention is not so limited, and may be practiced to
provide more relevant answers to a variety of queries. Reference in
the specification to a network "device" and/or "appliance" means
that a particular feature, structure, or characteristic, namely
device operable connectivity, such as the ability for the device to
be connected to communicate across the network, and/or
programmability, such as the ability for the device to be
configured to perform designated functions, is included in at least
one embodiment of the digital device as used herein. Typically,
digital devices may include general and/or special purpose
computing devices, connected personal computers, network printers,
network attached storage devices, voice over internet protocol
devices, security cameras, baby cameras, media adapters,
entertainment personal computers, and/or other networked devices
suitably configured for practicing the present invention in
accordance with at least one embodiment.
The description may use the phrases "in an embodiment," or "in
embodiments," which may each refer to one or more of the same or
different embodiments. Furthermore, the terms "comprising,"
"including," "having," and the like, as used with respect to
embodiments of the present invention, are synonymous.
The virtualization of machine resources has been of significant
interest for some time; however, with processors becoming more
diverse and complex, such as processors that are deeply
pipelined/super pipelined, hyper-threaded, on-chip multi-processing
capable, and processors having Explicitly Parallel Instruction
Computing (EPIC) architecture, and with larger instruction and data
caches, virtualization of machine resources is of even greater
interest.
Many attempts have been made to make virtualization more efficient.
For example, some vendors offer software products that have a
virtual machine system that permits a machine to be virtualized,
such that the underlying hardware resources of the machine appear
as one or more independently operating virtual machines (VM).
Typically, a Virtual Machine Monitor (VMM, also referred to as a
"Hypervisor") may be a thin layer of software running on a computer
responsible for creating, configuring, and managing VMs. It may
also be responsible for providing isolation between the VMs. In one
embodiment, the VMM may be an application running within a host
operating system. In one specific embodiment, the VMM may include 3
main portions: a kernel mode application or set of applications
running on the host operating system, a set of drivers in the host
operating system, and a co-operative kernel that substantially or
partially replaces the host kernel when the VM is running. In an
alternate embodiment, the VMM may be a layer of basic code
executing directly on the host hardware. Each VM, on the other
hand, may function as a self-contained platform, running its own
operating system (OS), or a copy of the OS, and/or a software
application. Software executing within a VM is collectively
referred to as "guest software" or "guest OS". Some commercial
solutions that provide software VMs include VMware, Inc. (VMware)
of Palo Alto, Calif. and VirtualPC by Microsoft Corp. of Redmond,
Wash.
FIG. 1 is a flowchart illustrating an embodiment of a technique for
memory access in accordance with the claimed subject matter. Block
105 illustrates that, in one embodiment, a system may be started
and basic initialization and configuration may occur. It is
understood that in some embodiments the system may be rebooted or
otherwise reset to a certain point. Also, it is understood that in
various embodiments, various forms of initialization and/or
configuration may or may not occur, including no initialization or
configuration.
Block 110 illustrates that, in one embodiment, a determination may
be made whether or not a system or apparatus is capable of running
a hypervisor (HV). In one embodiment, the determination may also
include whether or not a hypervisor is present and capable of being
run. In other embodiments the presence and ability to run a
hypervisor or substantially equivalent technology may be assumed.
In various embodiments, the determination may not involve a
hypervisor, but instead the ability of a chipset or processor to
support virtualization, such as, for example, the Intel
Virtualization Technology (VT), Advanced Micro Devices
Virtualization (AMD-V), or substantially equivalent
technologies.
Block 115 illustrates that, in one embodiment, if a hypervisor is
present, the hypervisor may be executed. Block 120 illustrates
that, in one embodiment, a determination may be made whether or not
the system supports IP address trapping. In one embodiment, the
support may be part of the system's hardware, firmware, software, or a
combination thereof. In one particular embodiment, the support may
be a specific function of the hypervisor.
In one specific illustrative embodiment, the system may allow only
2^44 addresses in the physically mapped memory space.
However, the hypervisor may support presenting the guest virtual
machine with a virtual memory space that provides 2^64
addresses. In one specific embodiment, the virtual addresses above
the 2^44 boundary may be mapped to a memory on a network device
via an IP address. As illustrated in more detail below, when a
guest virtual machine accesses a memory location in the mapped
space, the hypervisor may trap the memory request and forward it to
the network device.
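By way of illustration only, the following sketch models the boundary
check described in this embodiment. The boundary value, function name,
and return labels are assumptions chosen for the example rather than
details taken from the disclosure.

```python
PHYS_BOUNDARY = 1 << 44   # top of the physically mapped address space (assumed)
VIRT_LIMIT = 1 << 64      # size of the virtual space presented to the guest

def classify_access(virtual_address: int) -> str:
    """Decide whether a guest access is satisfied locally or must be trapped."""
    if not 0 <= virtual_address < VIRT_LIMIT:
        raise ValueError("address outside the guest's virtual address space")
    if virtual_address < PHYS_BOUNDARY:
        return "local"   # backed by physical memory on the system
    return "trap"        # backed by network memory; the hypervisor intervenes

# An access below the boundary proceeds normally; one above it is trapped.
assert classify_access(0x0000_0FFF_FFFF_0000) == "local"
assert classify_access(PHYS_BOUNDARY + 0x1000) == "trap"
```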
In one specific illustrative embodiment, to which the disclosed
subject matter is not limited, a cluster of systems or virtual
machines may be presented with versions of this 2^64 address
space. The area above the 2^44 boundary may be located on a
network device and shared between the various systems in the
cluster. Therefore, this memory space may be conveniently shared
between the devices of the cluster, without explicitly initiating
an inter-process communication request. Inter-process
communication, in this embodiment, may simply occur via a standard
memory access. Of course, in some embodiments, the IP memory space
(illustrated in this embodiment as the memory space above the
2^44 boundary) may be stored on a single or a plurality of
devices. It is also envisioned that this memory may take various
forms, such as, for example, standard RAM, a hard drive, a flash
drive, etc. It is also understood that the disclosed subject matter
is not limited to facilitating inter-process communication and that
that is merely one illustrative embodiment of the disclosed subject
matter.
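As a non-limiting sketch of the inter-process communication described
above, the following example uses a Python dictionary as a stand-in for
the memory hosted on the network device; the class, method names, and
mailbox address are assumptions made for illustration.

```python
class NetworkMemory:
    """Toy stand-in for the word-addressable store shared by the cluster."""
    def __init__(self):
        self._words = {}

    def write(self, address: int, value: int) -> None:
        self._words[address] = value

    def read(self, address: int) -> int:
        return self._words.get(address, 0)

shared = NetworkMemory()        # hosted on one or more network devices
MAILBOX = (1 << 44) + 0x100     # an address inside the IP memory space (assumed)

# One system communicates with another by an ordinary store...
shared.write(MAILBOX, 0xCAFE)
# ...and the peer receives the value with an ordinary load; no explicit
# inter-process communication request is issued.
assert shared.read(MAILBOX) == 0xCAFE
```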
In one embodiment, a protocol with a large number of network
addresses may be used, such as, for example, Internet Protocol
version 6 (IPv6), in which case each IP memory space address may
be associated with a particular network address. In other
embodiments, the overall network address space may be more limited
and the IP memory space may be a region or regions associated with
a network address, with offset information provided to
properly identify the particular IP memory address accessed. In yet
another embodiment, a combination of the two systems may be used.
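The two mapping styles may be sketched, under assumed parameters, as
follows; the IPv6 prefix, region size, and helper names are
illustrative only and are not prescribed by the disclosure.

```python
import ipaddress

PHYS_BOUNDARY = 1 << 44
REGION_SIZE = 1 << 30                            # assumed span per network address
PREFIX = int(ipaddress.IPv6Address("fd00::"))    # assumed private prefix

def map_one_to_one(virtual_address: int) -> ipaddress.IPv6Address:
    """One-to-one style: each IP memory space address has its own IPv6 address."""
    return ipaddress.IPv6Address(PREFIX | (virtual_address - PHYS_BOUNDARY))

def map_with_offset(virtual_address: int):
    """Offset style: many addresses share one network address plus an offset."""
    index = virtual_address - PHYS_BOUNDARY
    region, offset = divmod(index, REGION_SIZE)
    return ipaddress.IPv6Address(PREFIX | region), offset

addr = PHYS_BOUNDARY + 0x4_0000_1234
print(map_one_to_one(addr))    # a distinct IPv6 address for this word
print(map_with_offset(addr))   # a shared IPv6 address plus a byte offset
```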
It is understood that, while the terms IP address and network address
are frequently used interchangeably throughout this document, the
utilization of the Internet Protocol is merely one embodiment of
the disclosed subject matter and other networking protocols are
within the scope of the disclosed subject matter.
Block 125 illustrates the case where IP address trapping is
supported. In one embodiment, the hypervisor may allow trapping of
guest requests to access memory locations in the IP memory address
space. In one embodiment, this may include configuring the guest
operating system or BIOS to facilitate this activity. In another
embodiment, the hypervisor may be able to enable this feature
without configuring the guest operating system.
Block 130 illustrates that the hypervisor may, in one embodiment,
return information to the host system, specifically its firmware
or, in other embodiments, other portions of the host system. Block
135 illustrates that the pre-boot process may be completed on the
host system.
Block 140 illustrates that, in one embodiment, the operating system
may be booted. In one embodiment, this may be the operating system of
the host system. In another embodiment, the operating system may be the
guest operating system within a virtual machine. In one version of
such an embodiment, the host operating system may effectively be
the hypervisor. It is understood that, in one embodiment, various
virtual machine operating systems may boot at any time during this
process, and that parallel embodiments of the remaining
illustrative blocks may exist and be operating at a given time.
Block 145 illustrates that, in one embodiment, a determination may
be made whether or not a memory access has been attempted. Block
150 illustrates that, in one embodiment, if no memory access is
currently being attempted, normal processing may continue and the
flowchart may return to Block 145 (illustrated via the path flowing
through diagrammatic Blocks 155 and 155'). Therefore, in one
embodiment, Block 145 may be thought of as a wait state until a
memory access is attempted.
Block 160 illustrates that, in one embodiment, if a memory access
is attempted, a determination may be made whether or not the system
supports IP address trapping. In one embodiment, the IP address
space trapping may be supported by a hypervisor as discussed above.
In another embodiment, the IP address space trapping may be
supported utilizing other techniques, such as, for example,
hardware support in the chipset, processor, or other device;
firmware support; or support in the host operating system; however,
these are merely a few illustrative embodiments to which the
disclosed subject matter is not limited.
Block 165 illustrates that, in one embodiment, if IP address
trapping is not supported or not needed, the memory access may be
performed. Once accomplished, in one embodiment, the technique
illustrated by FIG. 1 may return to Block 145.
Block 170 illustrates that, in one embodiment, a determination may
be made as to whether or not the memory access is within the IP
address space. In one embodiment, Block 160 and Block 170 may be
combined into a single step. In one embodiment, the determination
may involve determining whether or not the memory access occurs for
a memory address greater than the maximum supported physical
address space. In this embodiment, all IP address space may be
located above a particular memory space boundary. For example, in
the specific illustrative embodiment described above, all memory
space above the 2^44 memory address boundary was considered to be in
the IP address space.
Block 175 illustrates that, in one embodiment, a determination may
be made whether or not the memory access is a read operation or a
write operation. Block 185 illustrates that, in one embodiment, if
the attempted memory access is a read, the network agent or device
responsible for or associated with the network address of the
accessed memory space may be contacted and the desired data may be
requested. In one embodiment, if the desired data is not returned
in a timely fashion, the system may report a memory access error,
or utilize a different failure mechanism. The flowchart may return
to Block 145 (illustrated via the path flowing through diagrammatic
Blocks 155 and 155').
Block 180 illustrates that, in one embodiment, if the attempted
memory access is a write, the network agent or device responsible
for or associated with the network address of the accessed memory
space may be contacted and the desired data may be sent. In one
embodiment, if the desired data is not successfully written in a
timely fashion, the system may report a memory access error, or
utilize a different failure mechanism. In one embodiment, the
system may assume the data has been correctly stored. The flowchart
may return to Block 145 (illustrated via the path flowing through
diagrammatic Block 155').
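A minimal sketch of the read/write forwarding of Blocks 175-185, under
assumed interfaces, is shown below; the NetworkDevice class, its
methods, and the error type are hypothetical stand-ins rather than
elements of the disclosure.

```python
class MemoryAccessError(Exception):
    """Reported when the network device does not answer in a timely fashion."""

class NetworkDevice:
    """Hypothetical agent responsible for the accessed network memory."""
    def __init__(self):
        self._store = {}
    def remote_read(self, offset, timeout=1.0):
        return self._store.get(offset, 0)      # a real device would go over the network
    def remote_write(self, offset, value, timeout=1.0):
        self._store[offset] = value

def handle_trapped_access(device, offset, value=None, is_write=False):
    """Forward a trapped guest access to the responsible network device."""
    try:
        if is_write:
            device.remote_write(offset, value)   # Block 180: contact device, send data
            return None
        return device.remote_read(offset)        # Block 185: contact device, request data
    except TimeoutError as exc:
        raise MemoryAccessError("network memory did not respond") from exc

dev = NetworkDevice()
handle_trapped_access(dev, 0x100, value=42, is_write=True)
assert handle_trapped_access(dev, 0x100) == 42
```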
FIG. 2 is a block diagram illustrating an embodiment of a system
200 and apparatus 201 for memory access in accordance with the
claimed subject matter. In one embodiment, the system may include a
network device 215 and the apparatus 201. In one embodiment, the
network device may include network memory 210. It is envisioned
that in various embodiments this network memory may take various
forms, such as, for example, standard RAM, a hard drive, a flash
drive, etc. In one embodiment, the network device may be capable
of facilitating the mapping of a memory region to the network
device.
In one embodiment, the apparatus 201 may include a memory 260, a
network interface 250, a hypervisor 240, and one or more virtual
machines 230. It is understood that in another embodiment the
apparatus may include an operating system and applications. While
these are not shown, the operating system would replace the
hypervisor 240 and the applications would replace the virtual
machines 230. In one embodiment, the hypervisor and associated
virtual machines may act as an application in the operating
system/application embodiment.
In one embodiment, the memory 260 may be capable of storing data
and/or instructions. In one embodiment, the network interface 250
may be capable of facilitating communication with network device
215. In one embodiment, the hypervisor 240 (or, in another
embodiment, the operating system) may be capable of executing at
least portions of the technique described above and illustrated in
FIG. 1. In one embodiment, the virtual machines 230 (or, in another
embodiment, the applications) may be capable of attempting to
access memory locations either within the memory 260 or the network
memory 210.
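For illustration, the components of FIG. 2 may be expressed as plain
data structures; the class and attribute names track the reference
numerals of the figure, but the composition shown is an assumption made
for this sketch, not a definitive description of the apparatus.

```python
from dataclasses import dataclass, field

@dataclass
class NetworkMemory:                  # 210
    words: dict = field(default_factory=dict)

@dataclass
class NetworkDevice:                  # 215
    memory: NetworkMemory = field(default_factory=NetworkMemory)

@dataclass
class NetworkInterface:               # 250, reaches network device 215
    peer: NetworkDevice

@dataclass
class Hypervisor:                     # 240 (or a host operating system)
    boundary: int = 1 << 44

@dataclass
class Apparatus:                      # 201
    memory: bytearray                          # 260, local physical memory
    network_interface: NetworkInterface        # 250
    hypervisor: Hypervisor                     # 240
    virtual_machines: list = field(default_factory=list)   # 230 (or applications)

device = NetworkDevice()
system = Apparatus(bytearray(1 << 20), NetworkInterface(device), Hypervisor(), ["vm0"])
```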
The techniques described herein are not limited to any particular
hardware or software configuration; they may find applicability in
any computing or processing environment. The techniques may be
implemented in hardware, software, firmware or a combination
thereof. The techniques may be implemented in programs executing on
programmable machines such as mobile or stationary computers,
personal digital assistants, and similar devices that each include
a processor, a storage medium readable or accessible by the
processor (including volatile and non-volatile memory and/or
storage elements), at least one input device, and one or more
output devices. Program code is applied to the data entered using
the input device to perform the functions described and to generate
output information. The output information may be applied to one or
more output devices.
Each program may be implemented in a high-level procedural or
object-oriented programming language to communicate with a
processing system. However, programs may be implemented in assembly
or machine language, if desired. In any case, the language may be
compiled or interpreted.
Each such program may be stored on a storage medium or device, e.g.
compact disk read only memory (CD-ROM), digital versatile disk
(DVD), hard disk, firmware, non-volatile memory, magnetic disk or
similar medium or device, that is readable by a general or special
purpose programmable machine for configuring and operating the
machine when the storage medium or device is read by the computer
to perform the procedures described herein. The system may also be
considered to be implemented as a machine-readable or accessible
storage medium, configured with a program, where the storage medium
so configured causes a machine to operate in a specific manner.
Other embodiments are within the scope of the following claims.
While certain features of the claimed subject matter have been
illustrated and described herein, many modifications,
substitutions, changes, and equivalents will now occur to those
skilled in the art. It is, therefore, to be understood that the
appended claims are intended to cover all such modifications and
changes that fall within the true spirit of the claimed subject
matter.
* * * * *