U.S. patent application number 11/929432 was filed with the patent office on 2007-10-30 and published on 2008-05-29 for memory device with emulated characteristics.
Invention is credited to Suresh Natarajan Rajan, Keith R. Schakel, Michael John Sebastian Smith, David T. Wang, Frederick Daniel Weber.
Application Number | 20080126689 11/929432 |
Document ID | / |
Family ID | 39331369 |
Filed Date | 2007-10-30 |
United States Patent Application | 20080126689 |
Kind Code | A1 |
Rajan; Suresh Natarajan; et al. | May 29, 2008 |
MEMORY DEVICE WITH EMULATED CHARACTERISTICS
Abstract
A memory subsystem is provided including an interface circuit
adapted for communication with a system and a majority of address
or control signals of a first number of memory circuits. The
interface circuit includes emulation logic for emulating at least
one memory circuit of a second number.
Inventors: |
Rajan; Suresh Natarajan (San Jose, CA); Schakel; Keith R. (San Jose,
CA); Smith; Michael John Sebastian (Palo Alto, CA); Wang; David T.
(San Jose, CA); Weber; Frederick Daniel (San Jose, CA) |
Correspondence
Address: |
ZILKA-KOTAB, PC- MRM1
P.O. BOX 721120
SAN JOSE
CA
95172-1120
US
|
Family ID: |
39331369 |
Appl. No.: |
11/929432 |
Filed: |
October 30, 2007 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
11762010 | Jun 12, 2007 | |
11929432 | | |
11461420 | Jul 31, 2006 | |
11762010 | | |
Current U.S.
Class: |
711/104 ; 703/23;
711/E12.001 |
Current CPC
Class: |
G06F 13/4243 20130101;
G11C 11/40618 20130101; Y02D 10/00 20180101; G11C 7/10 20130101;
Y02D 10/14 20180101; Y02D 10/151 20180101; G11C 11/406 20130101;
G11C 7/1072 20130101 |
Class at
Publication: |
711/104 ; 703/23;
711/E12.001 |
International
Class: |
G06F 9/455 20060101
G06F009/455; G06F 12/00 20060101 G06F012/00 |
Claims
1. A sub-system, comprising: an interface circuit adapted for
communication with a system and a majority of address or control
signals of a first number of memory circuits, the interface circuit
including emulation logic for emulating at least one memory circuit
of a second number.
2. The sub-system of claim 1, wherein the second number is less
than the first number.
3. The sub-system of claim 2, wherein the second number is one.
4. The sub-system of claim 1, wherein the emulation logic emulates
at least one memory circuit with a first memory capacity that is
different than a second memory capacity of at least one of the
plurality of memory circuits.
5. The sub-system of claim 1, wherein the interface circuit is
adapted for communication with all of the address or control
signals of the memory circuits.
6. The sub-system of claim 1, wherein the interface circuit is
adapted for communication with a majority of the address signals of
the memory circuits.
7. The sub-system of claim 1, wherein the interface circuit is
adapted for communication with a majority of the control signals of
the memory circuits.
8. The sub-system of claim 1, wherein the emulation includes an
electrical emulation.
9. The sub-system of claim 1, wherein the emulation includes a
logical emulation.
10. The sub-system of claim 1, wherein the interface circuit
includes a buffer chip.
11. The sub-system of claim 1, wherein the interface circuit is
positioned on a dual in-line memory module (DIMM).
12. The sub-system of claim 11, wherein the DIMM includes a small
outline-DIMM (SO-DIMM).
13. The sub-system of claim 11, wherein the DIMM includes a fully
buffered-DIMM (FB-DIMM).
14. The sub-system of claim 11, wherein the DIMM includes a
registered-DIMM (R-DIMM).
15. The sub-system of claim 1, wherein the memory circuits each
include dynamic random access memory (DRAM).
16. The sub-system of claim 15, wherein the memory circuits each
include a monolithic DRAM.
17. The sub-system of claim 16, wherein the memory circuits are
stacked.
18. The sub-system of claim 16, wherein the memory circuits and the
interface circuit are stacked.
19. A method, comprising: interfacing a majority of address or
control signals of a first number of memory circuits and a system;
and emulating at least one memory circuit of a second number.
20. An apparatus, comprising: a first number of memory circuits;
and an interface circuit in communication with the memory circuits,
the interface circuit including emulation logic for emulating at
least one memory circuit of a second number; wherein the interface
circuit interfaces a majority of address or control signals of the
memory circuits.
Description
RELATED APPLICATION
[0001] This application is a continuation of commonly-assigned U.S.
patent application Ser. No. 11/762,010 entitled "Memory Device with
Emulated Characteristics" filed Jun. 12, 2007 by Rajan, et al.,
which, in turn, is a continuation-in-part of commonly-assigned U.S.
patent application Ser. No. 11/461,420 entitled "System and Method
for Simulating a Different Number of Memory Circuits" filed Jul.
31, 2006 by Rajan, et al., which are incorporated by reference as
if fully set forth herein.
BACKGROUND OF THE INVENTION
[0002] 1. Technical Field of the Invention
[0003] This invention relates generally to digital memory such as
used in computers, and more specifically to organization and design
of memory modules such as DIMMs.
[0004] 2. Background Art
[0005] Digital memories are utilized in a wide variety of
electronic systems, such as personal computers, workstations,
servers, consumer electronics, printers, televisions, and so forth.
Digital memories are manufactured as monolithic integrated circuits
("ICs" or "chips"). Digital memories come in several types, such as
dynamic random access memory (DRAM), static random access memory
(SRAM), flash memory, electrically erasable programmable read only
memory (EEPROM), and so forth.
[0006] In some systems, the memory chips are coupled directly into
the system such as by being soldered directly to the system's main
motherboard. In other systems, groups of memory chips are first
coupled into memory modules, such as dual in-line memory modules
(DIMMs), which are in turn coupled into a system by means of slots,
sockets, or other connectors. Some types of memory modules include
not only the memory chips themselves, but also some additional
logic which interfaces the memory chips to the system. This logic
may perform a variety of low level functions, such as buffering or
latching signals between the chips and the system, but it may also
perform higher level functions, such as telling the system what are
the characteristics of the memory chips. These characteristics may
include, for example, memory capacity, speed, latency, interface
protocol, and so forth.
[0007] Memory capacity requirements of such systems are increasing
rapidly. However, other industry trends such as higher memory bus
speeds, small form factor machines, etc. are reducing the number of
memory module slots, sockets, connectors, etc. that are available
in such systems. There is, therefore, pressure for manufacturers to
use large capacity memory modules in such systems.
[0008] However, there is also an exponential relationship between a
memory chip's capacity and its price. As a result, large capacity
memory modules may be cost prohibitive in some systems.
[0009] What is needed, then, is an effective way to make use of low
cost memory chips in manufacturing high capacity memory
modules.
SUMMARY
[0010] A memory subsystem is provided including an interface
circuit adapted for communication with a system and a majority of
address or control signals of a first number of memory circuits.
The interface circuit includes emulation logic for emulating at
least one memory circuit of a second number.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] FIG. 1 shows a system coupled to multiple memory circuits
and an interface circuit according to one embodiment of this
invention.
[0012] FIG. 2 shows a buffered stack of DRAM circuits each having a
dedicated data path from the buffer chip and sharing a single
address, control, and clock bus.
[0013] FIG. 3 shows a buffered stack of DRAM circuits having two
address, control, and clock busses and two data busses.
[0014] FIG. 4 shows a buffered stack of DRAM circuits having one
address, control, and clock bus and two data busses.
[0015] FIG. 5 shows a buffered stack of DRAM circuits having one
address, control, and clock bus and one data bus.
[0016] FIG. 6 shows a buffered stack of DRAM circuits in which the
buffer chip is located in the middle of the stack of DRAM
chips.
[0017] FIG. 7 is a flow chart showing one method of storing
information.
[0018] FIG. 8 shows a high capacity DIMM using buffered stacks of
DRAM chips according to one embodiment of this invention.
[0019] FIG. 9 is a timing diagram showing one embodiment of how the
buffer chip makes a buffered stack of DRAM circuits appear to the
system or memory controller to use longer column address strobe
(CAS) latency DRAM chips than is actually used by the physical DRAM
chips.
[0020] FIG. 10 is a timing diagram showing the write data timing
expected by DRAM in a buffered stack, in accordance with another
embodiment of this invention.
[0021] FIG. 11 is a timing diagram showing how write control
signals are delayed by a buffer chip in accordance with another
embodiment of this invention.
[0022] FIG. 12 is a timing diagram showing early write data from a
memory controller or an advanced memory buffer (AMB) according to
yet another embodiment of this invention.
[0023] FIG. 13 is a timing diagram showing address bus conflicts
caused by delayed write operations.
[0024] FIG. 14 is a timing diagram showing variable delay of an
activate operation through a buffer chip.
[0025] FIG. 15 is a timing diagram showing variable delay of a
precharge operation through a buffer chip.
[0026] FIG. 16 shows a buffered stack of DRAM circuits and the
buffer chip which presents them to the system as if they were a
single, larger DRAM circuit, in accordance with one embodiment of
this invention.
[0027] FIG. 17 is a flow chart showing a method of refreshing a
plurality of memory circuits, in accordance with one embodiment of
this invention.
[0028] FIG. 18 shows a block diagram of another embodiment of the
invention.
DETAILED DESCRIPTION
[0029] The invention will be understood more fully from the
detailed description given below and from the accompanying drawings
of embodiments of the invention which, however, should not be taken
to limit the invention to the specific embodiments described, but
are for explanation and understanding only.
[0030] FIG. 1 illustrates a system 100 including a system device
106 coupled to an interface circuit 102, which is in turn coupled
to a plurality of physical memory circuits 104A-N. The physical
memory circuits may be any type of memory circuits. In some
embodiments, each physical memory circuit is a separate memory
chip. For example, each may be a DDR2 DRAM. In some embodiments,
the memory circuits may be symmetrical, meaning each has the same
capacity, type, speed, etc., while in other embodiments they may be
asymmetrical. For ease of illustration only, three such memory
circuits are shown, but actual embodiments may use any plural
number of memory circuits. As will be discussed below, the memory
chips may optionally be coupled to a memory module (not shown),
such as a DIMM.
[0031] The system device may be any type of system capable of
requesting and/or initiating a process that results in an access of
the memory circuits. The system may include a memory controller
(not shown) through which it accesses the memory circuits.
[0032] The interface circuit may include any circuit or logic
capable of directly or indirectly communicating with the memory
circuits, such as a buffer chip, advanced memory buffer (AMB) chip,
etc. The interface circuit interfaces a plurality of signals 108
between the system device and the memory circuits. Such signals may
include, for example, data signals, address signals, control
signals, clock signals, and so forth. In some embodiments, all of
the signals communicated between the system device and the memory
circuits are communicated via the interface circuit. In other
embodiments, some other signals 110 are communicated directly
between the system device (or some component thereof, such as a
memory controller, an AMB, or a register) and the memory circuits,
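without passing through the interface circuit. In some such
embodiments, the majority of signals are communicated via the
interface circuit, such that L>M, where L is the number of signals
108 communicated via the interface circuit and M is the number of
signals 110 communicated directly.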
[0033] As will be explained in greater detail below, the interface
circuit presents to the system device an interface to emulated
memory devices which differ in some aspect from the physical memory
circuits which are actually present. For example, the interface
circuit may tell the system device that the number of emulated
memory circuits is different than the actual number of physical
memory circuits. The terms "emulating", "emulated", "emulation",
and the like will be used in this disclosure to signify emulation,
simulation, disguising, transforming, converting, and the like,
which results in at least one characteristic of the memory circuits
appearing to the system device to be different than the actual,
physical characteristic. In some embodiments, the emulated
characteristic may be electrical in nature, physical in nature,
logical in nature, pertaining to a protocol, etc. An example of an
emulated electrical characteristic might be a signal, or a voltage
level. An example of an emulated physical characteristic might be a
number of pins or wires, a number of signals, or a memory capacity.
An example of an emulated protocol characteristic might be a
timing, or a specific protocol such as DDR3.
[0034] In the case of an emulated signal, such signal may be an
address signal, a data signal, or a control signal associated with
an activate operation, precharge operation, write operation, mode
register read operation, refresh operation, etc. The interface
circuit may emulate the number of
signals, type of signals, duration of signal assertion, and so
forth. It may combine multiple signals to emulate another
signal.
[0035] The interface circuit may present to the system device an
emulated interface to e.g. DDR3 memory, while the physical memory
chips are, in fact, DDR2 memory. The interface circuit may emulate
an interface to one version of a protocol such as DDR2 with 5-5-5
latency timing, while the physical memory chips are built to
another version of the protocol such as DDR2 with 3-3-3 latency
timing. The interface circuit may emulate an interface to a memory
having a first capacity that is different than the actual combined
capacity of the physical memory chips.
[0036] An emulated timing may relate to latency of e.g. a column
address strobe (CAS) latency, a row address to column address
latency (tRCD), a row precharge latency (tRP), an activate to
precharge latency (tRAS), and so forth. CAS latency is related to
the timing of accessing a column of data. tRCD is the latency
required between the row address strobe (RAS) and CAS. tRP is the
latency required to terminate an open row and open access to the
next row. tRAS is the latency required to access a certain row of
data between an activate operation and a precharge operation.
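The latency figures above can be related with a small calculation. The following Python sketch is illustrative only and is not taken from the specification: it assumes each latency is expressed in whole clock cycles, and that the difference between the emulated and the physical latency is the budget available for hiding any delay added by the interface circuit. The function names are hypothetical.

```python
from typing import Tuple

Timing = Tuple[int, int, int]  # (CAS latency, tRCD, tRP) in clock cycles


def latency_slack(emulated: Timing, physical: Timing) -> Timing:
    """Per-parameter difference between emulated and physical latency."""
    return tuple(e - p for e, p in zip(emulated, physical))


def delay_is_hidden(emulated: Timing, physical: Timing,
                    interface_delay_cycles: int) -> bool:
    """True if the assumed interface delay fits inside every slack value."""
    return all(s >= interface_delay_cycles
               for s in latency_slack(emulated, physical))


# Emulated DDR2 5-5-5 in front of physical DDR2 3-3-3 parts.
print(latency_slack((5, 5, 5), (3, 3, 3)))
print(delay_is_hidden((5, 5, 5), (3, 3, 3), interface_delay_cycles=2))
```

Under these assumptions a 5-5-5 emulated interface in front of 3-3-3 parts leaves two cycles per parameter for the interface circuit to use.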
[0037] The interface circuit may be operable to receive a signal
from the system device and communicate the signal to one or more of
the memory circuits after a delay (which may be hidden from the
system device). Such delay may be fixed, or in some embodiments it
may be variable. If variable, the delay may depend on e.g. a
function of the current signal or a previous signal, a combination
of signals, or the like. The delay may include a cumulative delay
associated with any one or more of the signals. The delay may
result in a time shift of the signal forward or backward in time
with respect to other signals. Different delays may be applied to
different signals. The interface circuit may similarly be operable
to receive a signal from a memory circuit and communicate the
signal to the system device after a delay.
[0038] The interface circuit may take the form of, or incorporate,
or be incorporated into, a register, an AMB, a buffer, or the like,
and may comply with Joint Electron Device Engineering Council
(JEDEC) standards, and may have forwarding, storing, and/or
buffering capabilities.
[0039] In some embodiments, the interface circuit may perform
operations without the system device's knowledge. One particularly
useful such operation is a power-saving operation. The interface
circuit may identify one or more of the memory circuits which are
not currently being accessed by the system device, and perform the
power saving operation on those. In one such embodiment, the
identification may involve determining whether any page (or other
portion) of memory is being accessed. The power saving operation
may be a power down operation, such as a precharge power down
operation.
[0040] The interface circuit may include one or more devices which
together perform the emulation and related operations. The
interface circuit may be coupled or packaged with the memory
devices, or with the system device or a component thereof, or
separately. In one embodiment, the memory circuits and the
interface circuit are coupled to a DIMM.
[0041] FIG. 2 illustrates one embodiment of a system 200 including
a system device (e.g. host system 204, etc.) which communicates
address, control, clock, and data signals with a memory subsystem
201 via an interface.
[0042] The memory subsystem includes a buffer chip 202 which
presents the host system with an emulated interface to emulated
memory, and a plurality of physical memory circuits which, in the
example shown, are DRAM chips 206A-D. In one embodiment, the DRAM
chips are stacked, and the buffer chip is placed electrically
between them and the host system. Although the embodiments
described here show the stack consisting of multiple DRAM circuits,
a stack may refer to any collection of memory circuits (e.g. DRAM
circuits, flash memory circuits, or combinations of memory circuit
technologies, etc.).
[0043] The buffer chip buffers and communicates signals between the
host system and the DRAM chips, and presents to the host system an
emulated interface to present the memory as though it were a
smaller number of larger capacity DRAM chips, although in actuality
there is a larger number of smaller capacity DRAM chips in the
memory subsystem. For example, there may be eight 512 Mb physical
DRAM chips, but the buffer chip buffers and emulates them to appear
as a single 4 Gb DRAM chip, or as two 2 Gb DRAM chips. Although the
drawing shows four DRAM chips, this is for ease of illustration
only; the invention is, of course, not limited to using four DRAM
chips.
[0044] In the example shown, the buffer chip is coupled to send
address, control, and clock signals 208 to the DRAM chips via a
single, shared address, control, and clock bus, but each DRAM chip
has its own, dedicated data path for sending and receiving data
signals 210 to/from the buffer chip.
[0045] Throughout this disclosure, the reference number 1 will be
used to denote the interface between the host system and the buffer
chip, the reference number 2 will be used to denote the address,
control, and clock interface between the buffer chip and the
physical memory circuits, and the reference number 3 will be used
to denote the data interface between the buffer chip and the
physical memory circuits, regardless of the specifics of how any of
those interfaces is implemented in the various embodiments and
configurations described below. In the configuration shown in FIG.
2, there is a single address, control, and clock interface channel
2 and four data interface channels 3; this implementation may thus
be said to have a "1A4D" configuration (wherein "1A" means one
address, control, and clock channel in interface 2, and "4D" means
four data channels in interface 3).
[0046] In the example shown, the DRAM chips are physically arranged
on a single side of the buffer chip. The buffer chip may,
optionally, be a part of the stack of DRAM chips, and may
optionally be the bottommost chip in the stack. Or, it may be
separate from the stack.
[0047] FIG. 3 illustrates another embodiment of a system 301 in
which the buffer chip 303 is interfaced to a host system 304 and is
coupled to the DRAM chips 307A-307D somewhat differently than in
the system of FIG. 2. There are a plurality of shared address,
control, and clock busses 309A and 309B, and a plurality of shared
data busses 305A and 305B. Each shared bus has two or more DRAM
chips coupled to it. As shown, the sharing need not necessarily be
the same in the data busses as it is in the address, control, and
clock busses. This embodiment has a "2A2D" configuration.
[0048] FIG. 4 illustrates another embodiment of a system 411 in
which the buffer chip 413 is interfaced to a host system 404 and is
coupled to the DRAM chips 417A-417D somewhat differently than in
the system of FIG. 2 or 3. There is a shared address, control, and
clock bus 419, and a plurality of shared data busses 415A and 415B.
Each shared bus has two or more DRAM chips coupled to it. This
implementation has a "1A2D" configuration.
[0049] FIG. 5 illustrates another embodiment of a system 521 in
which the buffer chip 523 is interfaced to a host system 504 and is
coupled to the DRAM chips 527A-527D somewhat differently than in
the system of FIGS. 2 through 4. There is a shared address,
control, and clock bus 529, and a shared data bus 525. This
implementation has a "1A1D" configuration.
[0050] FIG. 6 illustrates another embodiment of a system 631 in
which the buffer chip 633 is interfaced to a host system 604 and is
coupled to the DRAM chips 637A-637D somewhat differently than in
the system of FIGS. 2 through 5. There is a plurality of shared
address, control, and clock busses 639A and 639B, and a plurality
of dedicated data paths 635. Each shared bus has two or more DRAM
chips coupled to it. Further, in the example shown, the DRAM chips
are physically arranged on both sides of the buffer chip. There may
be, for example, sixteen DRAM chips, with the eight DRAM chips on
each side of the buffer chip arranged in two stacks of four chips
each. This implementation has a "2A4D" configuration.
[0051] FIGS. 2 through 6 are not intended to be an exhaustive
listing of all possible permutations of data paths, busses, and
buffer chip configurations, and are only illustrative of some ways
in which the host system device can be in electrical contact only
with the load of the buffer chip and thereby be isolated from
whatever physical memory circuits, data paths, busses, etc. exist
on the (logical) other side of the buffer chip.
[0052] FIG. 7 illustrates one embodiment of a method 700 for
storing at least a portion of information received in association
with a first operation, for use in performing a second operation.
Such a method may be practiced in a variety of systems, such as,
but not limited to, those of FIGS. 1-6. For example, the method may
be performed by the interface circuit of FIG. 1 or the buffer chip
of FIG. 2.
[0053] Initially, first information is received (702) in
association with a first operation to be performed on at least one
of the memory circuits (DRAM chips). Depending on the particular
implementation, the first information may be received prior to,
simultaneously with, or subsequent to the instigation of the first
operation. The first operation may be, for example, a row
operation, in which case the first information may include e.g.
address values received by the buffer chip via the address bus from
the host system. At least a portion of the first information is
then stored (704).
[0054] The buffer chip also receives (706) second information
associated with a second operation. For convenience, this receipt
is shown as being after the storing of the first information, but
it could also happen prior to or simultaneously with the storing.
The second operation may be, for example, a column operation.
[0055] Then, the buffer chip performs (708) the second operation,
utilizing the stored portion of the first information, and the
second information.
[0056] If the buffer chip is emulating a memory device which has a
larger capacity than each of the physical DRAM chips in the stack,
the buffer chip may receive from the host system's memory
controller more address bits than are required to address any given
one of the DRAM chips. In this instance, the extra address bits may
be decoded by the buffer chip to individually select the DRAM
chips, utilizing separate chip select signals (not shown) to each
of the DRAM chips in the stack.
[0057] For example, a stack of four x4 1 Gb DRAM chips behind
the buffer chip may appear to the host system as a single x4
4 Gb DRAM circuit, in which case the memory controller may provide
sixteen row address bits and three bank address bits during a row
operation (e.g. an activate operation), and provide eleven column
address bits and three bank address bits during a column operation
(e.g. a read or write operation). However, the individual DRAM
chips in the stack may require only fourteen row address bits and
three bank address bits for a row operation, and eleven column
address bits and three bank address bits during a column operation.
As a result, during a row operation (the first operation in the
method 702), the buffer chip may receive two address bits more than
are needed by any of the DRAM chips. The buffer chip stores (704)
these two extra bits during the row operation (in addition to using
them to select the correct one of the DRAM chips), then uses them
later, during the column operation, to select the correct one of
the DRAM chips.
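The bit counts in this example can be checked with simple arithmetic. The Python sketch below is an illustration, not part of the specification; it assumes capacity is the product of rows, banks, columns, and data width, and the helper name capacity_bits is hypothetical.

```python
def capacity_bits(row_bits: int, bank_bits: int, col_bits: int, width: int) -> int:
    """Addressable capacity in bits: rows x banks x columns x data width."""
    return (1 << row_bits) * (1 << bank_bits) * (1 << col_bits) * width


GBIT = 1 << 30

# Physical x4 chip: 14 row bits, 3 bank bits, 11 column bits.
physical = capacity_bits(14, 3, 11, 4)
# Emulated x4 device: 16 row bits, 3 bank bits, 11 column bits.
emulated = capacity_bits(16, 3, 11, 4)

extra_row_bits = 16 - 14
chips_needed = emulated // physical

print(physical // GBIT, emulated // GBIT, extra_row_bits, chips_needed)
```

The two extra row address bits distinguish four values, which matches the four physical chips that together supply the emulated capacity.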
[0058] The mapping between a system address (from the host system
to the buffer chip) and a device address (from the buffer chip to a
DRAM chip) may be performed in various manners. In one embodiment,
lower order system row address and bank address bits may be mapped
directly to the device row address and bank address bits, with the
most significant system row address bits (and, optionally, the most
significant bank address bits) being stored for use in the
subsequent column operation. In one such embodiment, what is stored
is the decoded version of those bits; in other words, the extra
bits may be stored either prior to or after decoding. The stored
bits may be stored, for example, in an internal lookup table (not
shown) in the buffer chip, for one or more clock cycles.
[0059] As another example, the buffer chip may have four 512 Mb
DRAM chips with which it emulates a single 2 Gb DRAM chip. The
system will present fifteen row address bits, from which the buffer
chip may use the fourteen low order bits (or, optionally, some
other set of fourteen bits) to directly address the DRAM chips. The
system will present three bank address bits, from which the buffer
chip may use the two low order bits (or, optionally, some other set
of two bits) to directly address the DRAM chips. During a row
operation, the most significant bank address bit (or other unused
bit) and the most significant row address bit (or other unused bit)
are used to generate the four DRAM chip select signals, and are
stored for later reuse. And during a subsequent column operation,
the stored bits are again used to generate the four DRAM chip
select signals. Optionally, the unused bank address is not stored
during the row operation, as it will be re-presented during the
subsequent column operation.
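A minimal sketch of the row-operation decode in this example follows, written in Python for illustration. It assumes the low order bits are the ones passed through, that the chip index is formed with the unused bank bit above the unused row bit, and that chip selects are one-hot; the specification leaves these choices open, and the function name is hypothetical.

```python
DEVICE_ROW_BITS = 14   # row address bits of each 512 Mb chip
DEVICE_BANK_BITS = 2   # bank address bits of each 512 Mb chip


def decode_row_operation(sys_bank: int, sys_row: int):
    """Map a 3-bit system bank and 15-bit system row to chip signals.

    Returns (chip_select, device_bank, device_row), where chip_select is
    a one-hot 4-bit value (an assumed encoding).
    """
    device_row = sys_row & ((1 << DEVICE_ROW_BITS) - 1)
    device_bank = sys_bank & ((1 << DEVICE_BANK_BITS) - 1)
    row_msb = (sys_row >> DEVICE_ROW_BITS) & 1
    bank_msb = (sys_bank >> DEVICE_BANK_BITS) & 1
    chip_index = (bank_msb << 1) | row_msb  # assumed bit ordering
    return 1 << chip_index, device_bank, device_row


print(decode_row_operation(0b111, 0x7FFF))
print(decode_row_operation(0b000, 0x4000))
```

Only the row address bit would need to be stored for the column operation in the optional variant, since the bank address is presented again by the system.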
[0060] As yet another example, addresses may be mapped between four
1 Gb DRAM circuits to emulate a single 4 Gb DRAM circuit. Sixteen
row address bits and three bank address bits come from the host
system, of which the low order fourteen address bits and all three
bank address bits are mapped directly to the DRAM circuits. During
a row operation, the two most significant row address bits are
decoded to generate four chip select signals, and are stored using
the bank address bits as the index. During the subsequent column
operation, the stored row address bits are again used to generate
the four chip select signals.
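The store-and-reuse behavior of this example can be sketched as a small Python model. This is an illustration under stated assumptions, not the buffer chip's actual logic: the class and method names are hypothetical, chip selects are modeled as one-hot values, and the lookup table is modeled as a list with one entry per bank.

```python
class BankIndexedMapper:
    """Models four 1 Gb chips emulating a single 4 Gb device."""

    DEVICE_ROW_BITS = 14
    NUM_BANKS = 8  # three bank address bits

    def __init__(self):
        # One stored entry per bank, indexed by the bank address.
        self._stored = [None] * self.NUM_BANKS

    def activate(self, bank: int, row: int):
        """Row operation: decode and store the two high row bits."""
        extra = (row >> self.DEVICE_ROW_BITS) & 0b11
        self._stored[bank] = extra
        device_row = row & ((1 << self.DEVICE_ROW_BITS) - 1)
        return 1 << extra, bank, device_row

    def column(self, bank: int, col: int):
        """Column operation: reuse the bits stored for this bank."""
        extra = self._stored[bank]
        if extra is None:
            raise RuntimeError("no row operation seen for this bank")
        return 1 << extra, bank, col


mapper = BankIndexedMapper()
print(mapper.activate(bank=5, row=0xC123))
print(mapper.column(bank=5, col=0x7F))
```

Indexing by bank address lets each bank have a different chip open at the same time, because a later column operation names only the bank and column.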
[0061] A particular mapping technique may be chosen, to ensure that
there are no unnecessary combinational logic circuits in the
critical timing path between the address input pins and address
output pins of the buffer chip. Corresponding combinational logic
circuits may instead be used to generate the individual chip select
signals. This may allow the capacitive loading on the address
outputs of the buffer chip to be much higher than the loading on
the individual chip select signal outputs of the buffer chip.
[0062] In another embodiment, the address mapping may be performed
by the buffer chip using some of the bank address signals from the
host system to generate the chip select signals. The buffer chip
may store the higher order row address bits during a row operation,
using the bank address as the index, and then use the stored
address bits as part of the DRAM circuit bank address during a
column operation.
[0063] For example, four 512 Mb DRAM chips may be used in emulating
a single 2 Gb DRAM. Fifteen row address bits come from the host
system, of which the low order fourteen are mapped directly to the
DRAM chips. Three bank address bits come from the host system, of
which the least significant bit is used as a DRAM circuit bank
address bit for the DRAM chips. The most significant row address
bit may be used as an additional DRAM circuit bank address bit.
During a row operation, the two most significant bank address bits
are decoded to generate the four chip select signals. The most
significant row address bit may be stored during the row operation,
and reused during the column operation with the least significant
bank address bit, to form the DRAM circuit bank address.
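The bank-based mapping of this example can be sketched in the same way. The Python model below is illustrative only: the names are hypothetical, chip selects are assumed to be one-hot, and the stored row bit is assumed to form the upper bit of the two-bit device bank address (the specification does not fix the bit order).

```python
class BankSelectMapper:
    """Models four 512 Mb chips emulating a single 2 Gb device."""

    DEVICE_ROW_BITS = 14

    def __init__(self):
        # Most significant row bit, stored per system bank address.
        self._row_msb = {}

    @staticmethod
    def _route(sys_bank: int, row_msb: int):
        # The two most significant bank bits pick one of four chips.
        chip_select = 1 << ((sys_bank >> 1) & 0b11)
        # Device bank: stored row bit plus least significant bank bit.
        device_bank = (row_msb << 1) | (sys_bank & 1)
        return chip_select, device_bank

    def activate(self, sys_bank: int, sys_row: int):
        row_msb = (sys_row >> self.DEVICE_ROW_BITS) & 1
        self._row_msb[sys_bank] = row_msb
        device_row = sys_row & ((1 << self.DEVICE_ROW_BITS) - 1)
        return self._route(sys_bank, row_msb) + (device_row,)

    def column(self, sys_bank: int, col: int):
        return self._route(sys_bank, self._row_msb[sys_bank]) + (col,)


mapper = BankSelectMapper()
print(mapper.activate(0b101, 0x4005))
print(mapper.column(0b101, 9))
```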
[0064] The column address from the host system memory controller
may be mapped directly as the column address to the DRAM chips in
the stack, since each of the DRAM chips may have the same page
size, regardless of any differences in the capacities of the
(asymmetrical) DRAM chips.
[0065] Optionally, address bit A(10) may be used by the memory
controller to enable or disable auto-precharge during a column
operation, in which case the buffer chip may forward that bit to
the DRAM circuits without any modification during a column
operation.
[0066] In various embodiments, it may be desirable to determine
whether the simulated DRAM circuit behaves according to a desired
DRAM standard or other design specification. Behavior of many DRAM
circuits is specified by the JEDEC standards, and it may be
desirable to exactly emulate a particular JEDEC standard DRAM. The
JEDEC standard defines control signals that a DRAM circuit must
accept and the behavior of the DRAM circuit as a result of such
control signals. For example, the JEDEC specification for DDR2 DRAM
is known as JESD79-2B. If it is desired to determine whether a
standard is met, the following algorithm may be used. A set of
software verification tools formally verifies the logic, checking
that the protocol behavior of the simulated DRAM circuit is the
same as the desired standard or other design specification.
Examples of suitable verification tools include: Magellan, supplied
by Synopsys, Inc. of 700 E. Middlefield Rd., Mt. View, Calif.
94043; Incisive, supplied by Cadence Design Systems, Inc., of 2655
Sealy Ave., San Jose, Calif. 95134; tools supplied by Jasper Design
Automation, Inc. of 100 View St. #100, Mt. View, Calif. 94041;
Verix, supplied by Real Intent, Inc., of 505 N. Mathilda Ave. #210,
Sunnyvale, Calif. 94085; 0-In, supplied by Mentor Graphics Corp. of
8005 SW Boeckman Rd., Wilsonville, Oreg. 97070; and others. These
software verification tools use written assertions that correspond
to the rules established by the particular DRAM protocol and
specification. These written assertions are further included in the
code that forms the logic description for the buffer chip. By
writing assertions that correspond to the desired behavior of the
emulated DRAM circuit, a proof may be constructed that determines
whether the desired design requirements are met.
[0067] For instance, an assertion may be written that no two DRAM
control signals are allowed to be issued to an address, control,
and clock bus at the same time. Although one may know which of the
various buffer chip/DRAM stack configurations and address mappings
(such as those described above) are suitable, the verification
process allows a designer to prove that the emulated DRAM circuit
exactly meets the required standard or other design specification.
If, for example, an address mapping that uses a common bus for data
and a common bus for address results in a control and clock bus that
does not meet
a required specification, alternative designs for buffer chips with
other bus arrangements or alternative designs for the sideband
signal interconnect between two or more buffer chips may be used
and tested for compliance. Such sideband signals convey the power
management signals, for example.
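By way of illustration only, the rule described above (no two control signals on the bus at the same time) can be sketched in Python as a check over a simulated command trace. The trace format, a list of (clock, command) pairs, and the function name are assumptions made for this sketch and are not taken from any of the verification tools named above. A formal tool would prove the property for all reachable behavior, whereas this sketch only tests a single trace.

```python
from collections import Counter


def find_bus_conflicts(trace):
    """Return the clocks at which more than one command occupies the bus.

    `trace` is a hypothetical list of (clock, command) pairs describing what
    the buffer chip drives onto the address, control, and clock bus.
    """
    per_clock = Counter(clock for clock, _command in trace)
    return sorted(clock for clock, count in per_clock.items() if count > 1)


# One command per clock satisfies the rule; two commands at clock 3 violate it.
compliant = [(1, "ACTIVATE"), (3, "WRITE"), (7, "PRECHARGE")]
violating = [(1, "ACTIVATE"), (3, "WRITE"), (3, "ACTIVATE")]
assert find_bus_conflicts(compliant) == []
assert find_bus_conflicts(violating) == [3]
```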
[0068] FIG. 8 illustrates a high capacity DIMM 800 using a
plurality of buffered stacks of DRAM circuits 802 and a register
device 804, according to one embodiment of this invention. The
register performs the addressing and control of the buffered
stacks. In some embodiments, the DIMM may be an FB-DIMM, in which
case the register is an AMB. In one embodiment the emulation is
performed at the DIMM level.
[0069] FIG. 9 is a timing diagram illustrating a timing design 900
of a buffer chip which makes a buffered stack of DRAM chips mimic a
larger DRAM circuit having longer CAS latency, in accordance with
another embodiment of this invention. Any delay through a buffer
chip may be made transparent to the host system's memory
controller, by using such a method. Such a delay may be a result of
the buffer chip being located electrically between the memory bus
of the host system and the stacked DRAM circuits, since some or all
of the signals that connect the memory bus to the DRAM circuits
pass through the buffer chip. A finite amount of time may be needed
for these signals to traverse through the buffer chip. With the
exception of register chips and AMBs, industry standard memory
protocols may not comprehend the buffer chip that sits between the
memory bus and the DRAM chips. Industry standards narrowly define
the properties of a register chip and an AMB, but not the
properties of the buffer chip of this embodiment. Thus, any signal
delay caused by the buffer chip may cause a violation of the
industry standard protocols.
[0070] In one embodiment, the buffer chip may cause a one-half
clock cycle delay between the buffer chip receiving address and
control signals from the host system memory controller (or,
optionally, from a register chip or an AMB), and the address and
control signals being valid at the inputs of the stacked DRAM
circuits. Data signals may also have a one-half clock cycle delay
in either direction to/from the host system. Other amounts of delay
are, of course, possible, and the half-clock cycle example is for
illustration only.
[0071] The cumulative delay through the buffer chip is the sum of a
delay of the address and control signals and a delay of the data
signals. FIG. 9 illustrates an example where the buffer chip is
using DRAM chips having a native CAS latency of i clocks, and the
buffer chip delay is j clocks, thus the buffer chip emulates a DRAM
having a CAS latency of i+j clocks. In the example shown, the DRAM
chips have a native CAS latency 906 of four clocks (from t1 to t5),
and the total latency through the buffer chip is two clocks (one
clock delay 902 from t0 to t1 for address and control signals, plus
one clock delay 904 from t5 to t6 for data signals), and the buffer
chip emulates a DRAM having a six clock CAS latency 908.
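The arithmetic of FIG. 9 may be sketched as follows. This is a minimal Python illustration, not a description of the buffer chip logic itself; the function and parameter names are hypothetical, and all times are in whole clocks.

```python
def read_timeline(t_issue, native_cl, addr_ctrl_delay, data_delay):
    """Return (command at DRAM, data out of DRAM, data at host) in clocks."""
    t_cmd_at_dram = t_issue + addr_ctrl_delay       # delay 902
    t_data_from_dram = t_cmd_at_dram + native_cl    # native CAS latency 906
    t_data_at_host = t_data_from_dram + data_delay  # delay 904
    return t_cmd_at_dram, t_data_from_dram, t_data_at_host


def emulated_cas_latency(native_cl, addr_ctrl_delay, data_delay):
    """CAS latency seen by the host: native latency i plus buffer delay j."""
    return native_cl + addr_ctrl_delay + data_delay


# FIG. 9 values: read issued at t0, native CL of 4, one clock each way.
assert read_timeline(0, 4, 1, 1) == (1, 5, 6)
assert emulated_cas_latency(4, 1, 1) == 6
```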
[0072] In FIG. 9 (and other timing diagrams), the reference numbers
1, 2, and/or 3 at the left margin indicate which of the interfaces
correspond to the signals or values illustrated on the associated
waveforms. For example, in FIG. 9: the "Clock" signal shown as a
square wave on the uppermost waveform is indicated as belonging to
the interface 1 between the host system and the buffer chip; the
"Control Input to Buffer" signal is also part of the interface 1;
the "Control Input to DRAM" waveform is part of the interface 2
from the buffer chip to the physical memory circuits; the "Data
Output from DRAM" waveform is part of the interface 3 from the
physical memory circuits to the buffer chip; and the "Data Output
from Buffer" shown in the lowermost waveform is part of the
interface 1 from the buffer chip to the host system.
[0073] FIG. 10 is a timing diagram illustrating a timing design
1000 of write data timing expected by a DRAM circuit in a buffered
stack. Emulation of a larger capacity DRAM circuit having higher
CAS latency (as in FIG. 9) may, in some implementations, create a
problem with the timing of write operations. For example, with
respect to a buffered stack of DDR2 SDRAM chips with a read CAS
latency of four clocks which are used in emulating a single larger
DDR2 SDRAM with a read CAS latency of six clocks, the DDR2 SDRAM
protocol may specify that the write CAS latency 1002 is one less
than the read CAS latency. Therefore, since the buffered stack
appears as a DDR2 SDRAM with a read CAS latency of six clocks, the
memory controller may use a buffered stack write CAS latency of
five clocks 1004 when scheduling a write operation to the
memory.
[0074] In the specific example shown, the memory controller issues
the write operation at t0. After a one clock cycle delay through
the buffer chip, the write operation is issued to the DRAM chips at
t1. Because the memory controller believes it is connected to
memory having a read CAS latency of six clocks and thus a write CAS
latency of five clocks, it issues the write data at time t0+5=t5.
But because the physical DRAM chips have a read CAS latency of four
clocks and thus a write CAS latency of three clocks, they expect to
receive the write data at time t1+3=t4. Hence the problem, which
the buffer chip may alleviate by delaying write operations.
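The mismatch described above can be reproduced with a short calculation. The sketch below assumes, as stated above for DDR2, that the write CAS latency is one less than the read CAS latency; the function name and its parameters are hypothetical.

```python
def write_data_mismatch(t_issue, emulated_read_cl, native_read_cl, cmd_delay):
    """Return (time the controller sends write data, time the DRAM expects it).

    Assumes write CAS latency = read CAS latency - 1, in whole clocks.
    """
    t_data_sent = t_issue + (emulated_read_cl - 1)
    t_data_expected = t_issue + cmd_delay + (native_read_cl - 1)
    return t_data_sent, t_data_expected


# FIG. 10 values: the controller sends data at t5, but the DRAM expects it at t4.
assert write_data_mismatch(0, 6, 4, 1) == (5, 4)
```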
[0075] The waveform "Write Data Expected by DRAM" is not shown as
belonging to interface 1, interface 2, or interface 3, for the
simple reason that there is no such signal present in any of those
interfaces. That waveform represents only what is expected by the
DRAM, not what is actually provided to the DRAM.
[0076] FIG. 11 is a timing diagram illustrating a timing design 1100
showing how the buffer chip does this. The memory controller issues
the write operation at t0. In FIG. 10, the write operation appeared
at the DRAM circuits one clock later at t1, due to the inherent
delay through the buffer chip. But in FIG. 11, in addition to the
inherent one clock delay, the buffer chip has added an extra two
clocks of delay to the write operation, which is not issued to the
DRAM chips until t0+1+2=t3. Because the DRAM chips receive the
write operation at t3 and have a write CAS latency of three clocks,
they expect to receive the write data at t3+3=t6. Because the
memory controller issued the write operation at t0, and it expects
a write CAS latency of five clocks, it issues the write data at
time t0+5=t5. After a one clock delay through the buffer chip, the
write data arrives at the DRAM chips at t5+1=t6, and the timing
problem is solved.
[0077] It should be noted that the extra delay of j clocks (beyond
the inherent delay) which the buffer chip deliberately adds before
issuing the write operation to the DRAM is the sum of the inherent
delay of the address and control signals and the inherent delay of
the data signals. In the example shown, both those inherent delays
are one clock, so j=2.
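The delayed-write timing of FIG. 11 may be checked with the following Python sketch. It is an illustration under the same DDR2 assumption (write CAS latency one less than read CAS latency); the names are hypothetical.

```python
def delayed_write_schedule(t_issue, emulated_read_cl, native_read_cl,
                           addr_ctrl_delay, data_delay):
    """Return (write command at DRAM, data expected by DRAM, data at DRAM)."""
    extra = addr_ctrl_delay + data_delay               # j extra clocks
    t_cmd_at_dram = t_issue + addr_ctrl_delay + extra  # inherent + extra delay
    t_data_expected = t_cmd_at_dram + (native_read_cl - 1)
    t_data_sent = t_issue + (emulated_read_cl - 1)     # controller's view
    t_data_at_dram = t_data_sent + data_delay
    return t_cmd_at_dram, t_data_expected, t_data_at_dram


# FIG. 11 values: command reaches the DRAM at t3, and data is both expected
# and delivered at t6.
assert delayed_write_schedule(0, 6, 4, 1, 1) == (3, 6, 6)
```

With these values the second and third entries are equal, which is the condition for the write data to arrive when the DRAM expects it.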
[0078] FIG. 12 is a timing diagram illustrating operation of an
FB-DIMM's AMB, which may be designed to send write data earlier to
buffered stacks instead of delaying the write address and operation
(as in FIG. 11). Specifically, it may use an early write CAS
latency 1202 to compensate the timing of the buffer chip write
operation. If the buffer chip has a cumulative (address and data)
inherent delay of two clocks, the AMB may send the write data to
the buffered stack two clocks early. This may not be possible in
the case of registered DIMMs, in which the memory controller sends
the write data directly to the buffered stacks (rather than via the
AMB). In another embodiment, the memory controller itself could be
designed to send write data early, to compensate for the j clocks
of cumulative inherent delay caused by the buffer chip.
[0079] In the example shown, the memory controller issues the write
operation at t0. After a one clock inherent delay through the
buffer chip, the write operation arrives at the DRAM at t1. The
DRAM expects the write data at t1+3=t4. The industry specification
would suggest a nominal write data time of t0+5=t5, but the AMB (or
memory controller), which already has the write data (which are
provided with the write operation), is configured to perform an
early write at t5-2=t3. After the inherent delay 1203 through the
buffer chip, the write data arrive at the DRAM at t3+1=t4, exactly
when the DRAM expects them; specifically, with a three-cycle DRAM
Write CAS latency 1204 which is equal to the three-cycle Early
Write CAS Latency 1202.
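The early-write alternative of FIG. 12 may be sketched in the same manner. Here the write command is not delayed beyond the inherent delay; instead the data is sent early by the cumulative inherent delay. The function name and parameters are hypothetical, and the DDR2 relationship between write and read CAS latency is again assumed.

```python
def early_write_schedule(t_issue, emulated_read_cl, native_read_cl,
                         addr_ctrl_delay, data_delay):
    """Return (command at DRAM, data expected, data sent early, data at DRAM)."""
    t_cmd_at_dram = t_issue + addr_ctrl_delay
    t_data_expected = t_cmd_at_dram + (native_read_cl - 1)
    t_nominal = t_issue + (emulated_read_cl - 1)
    t_data_sent = t_nominal - (addr_ctrl_delay + data_delay)  # early by j
    t_data_at_dram = t_data_sent + data_delay
    return t_cmd_at_dram, t_data_expected, t_data_sent, t_data_at_dram


# FIG. 12 values: data sent at t3 arrives at t4, when the DRAM expects it.
assert early_write_schedule(0, 6, 4, 1, 1) == (1, 4, 3, 4)
```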
[0080] FIG. 13 is a timing diagram 1300 illustrating bus conflicts
which can be caused by delayed write operations. The delaying of
write addresses and write operations may be performed by a buffer
chip, a register, an AMB, etc. in a manner that is completely
transparent to the memory controller of the host system. And,
because the memory controller is unaware of this delay, it may
schedule subsequent operations such as activate or precharge
operations, which may collide with the delayed writes on the
address bus to the DRAM chips in the stack.
[0081] An example is shown, in which the memory controller issues a
write operation 1302 at time t0. The buffer chip or AMB delays the
write operation, such that it appears on the bus to the DRAM chips
at time t3. Unfortunately, at time t2 the memory controller issued
an activate operation (control signal) 1304 which, after a
one-clock inherent delay through the buffer chip, appears on the
bus to the DRAM chips at time t3, colliding with the delayed
write.
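The collision of FIG. 13 may be reproduced with the following sketch, which maps each operation issued by the memory controller to the clock at which it appears on the bus to the DRAM chips. The command tuples, the function names, and the default delay values are assumptions made for illustration.

```python
def dram_bus_time(t_issue, op, inherent_delay=1, extra_write_delay=2):
    """Clock at which an operation appears on the bus to the DRAM chips."""
    extra = extra_write_delay if op == "WRITE" else 0
    return t_issue + inherent_delay + extra


def find_collisions(commands, inherent_delay=1, extra_write_delay=2):
    """Return (clock, earlier op, later op) for each address bus collision."""
    occupied = {}
    clashes = []
    for t_issue, op in commands:
        t_bus = dram_bus_time(t_issue, op, inherent_delay, extra_write_delay)
        if t_bus in occupied:
            clashes.append((t_bus, occupied[t_bus], op))
        else:
            occupied[t_bus] = op
    return clashes


# FIG. 13: a write issued at t0 and an activate issued at t2 both land on t3.
assert find_collisions([(0, "WRITE"), (2, "ACTIVATE")]) == [(3, "WRITE", "ACTIVATE")]
```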
[0082] FIGS. 14 and 15 are a timing diagram 1400 and a timing
diagram 1500 illustrating methods of avoiding such collisions. If
the cumulative latency through the buffer chip is two clock cycles,
and the native read CAS latency of the DRAM chips is four clock
cycles, then in order to hide the delay of the address and control
signals and the data signals through the buffer chip, the buffer
chip presents the host system with an interface to an emulated
memory having a read CAS latency of six clock cycles. And if the
tRCD and tRP of the DRAM chips are four clock cycles each, the
buffer chip tells the host system that they are six clock cycles
each in order to allow the buffer chip to delay the activate and
precharge operations to avoid collisions in a manner that is
transparent to the host system.
[0083] For example, a buffered stack that uses 4-4-4 DRAM chips
(that is, CAS latency=4, tRCD=4, and tRP=4) may appear to the host
system as one larger DRAM that uses 6-6-6 timing.
[0084] Since the buffered stack appears to the host system's memory
controller as having a tRCD of six clock cycles, the memory
controller may schedule a column operation to a bank six clock
cycles (at time t6) after an activate (row) operation (at time t0)
to the same bank. However, the DRAM chips in the stack actually
have a tRCD of four clock cycles. This gives the buffer chip time
to delay the activate operation by up to two clock cycles, avoiding
any conflicts on the address bus between the buffer chip and the
DRAM chips, while ensuring correct read and write timing on the
channel between the memory controller and the buffered stack.
[0085] As shown, the buffer chip may issue the activate operation
to the DRAM chips one, two, or three clock cycles after it receives
the activate operation from the memory controller, register, or
AMB. The actual delay selected may depend on the presence or
absence of other DRAM operations that may conflict with the
activate operation, and may optionally change from one activate
operation to another. In other words, the delay may be dynamic. A
one-clock delay (1402A, 1502A) may be accomplished simply by the
inherent delay through the buffer chip. A two-clock delay (1402B,
1502B) may be accomplished by adding one clock of additional delay
to the one-clock inherent delay, and a three-clock delay (1402C,
1502C) may be accomplished by adding two clocks of additional delay
to the one-clock inherent delay. A read, write, or activate
operation issued by the memory controller at time t6 will, after a
one-clock inherent delay through the buffer chip, be issued to the
DRAM chips at time t7. A preceding activate or precharge operation
issued by the memory controller at time t0 will, depending upon the
delay, be issued to the DRAM chips at time t1, t2, or t3, each of
which is at least the tRCD or tRP of four clocks earlier than the
t7 issuance of the read, write, or activate operation.
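One possible way to select the dynamic delay is sketched below. The policy of taking the earliest free slot on the DRAM address bus is an assumption chosen for illustration; the description above does not prescribe a particular selection policy, and the function name is hypothetical.

```python
def schedule_row_op(t_received, busy_slots, inherent_delay=1, max_extra=2):
    """Pick the earliest free DRAM-bus clock for an activate or precharge.

    `busy_slots` is the set of clocks already occupied on the DRAM address
    bus. The operation may be issued 1, 2, or 3 clocks after receipt.
    """
    for extra in range(max_extra + 1):
        slot = t_received + inherent_delay + extra
        if slot not in busy_slots:
            return slot
    raise RuntimeError("no conflict-free slot within the allowed delay")


# With a delayed write occupying t3, an activate received at t2 moves to t4.
assert schedule_row_op(2, {3}) == 4

# A column operation issued at t6 reaches the DRAM at t7; every permitted
# activate slot (t1, t2, t3) still satisfies a native tRCD of four clocks.
assert all(7 - schedule_row_op(0, busy) >= 4 for busy in (set(), {1}, {1, 2}))
```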
[0086] Since the buffered stack appears to the memory controller to
have a tRP of six clock cycles, the memory controller may schedule
a subsequent activate (row) operation to a bank a minimum of six
clock cycles after issuing a precharge operation to that bank.
However, since the DRAM circuits in the stack actually have a tRP
of four clock cycles, the buffer chip may have the ability to delay
issuing the precharge operation to the DRAM chips by up to two
clock cycles, in order to avoid any conflicts on the address bus,
or in order to satisfy the tRAS requirements of the DRAM chips.
[0087] In particular, if the activate operation to a bank was
delayed to avoid an address bus conflict, then the precharge
operation to the same bank may be delayed by the buffer chip to
satisfy the tRAS requirements of the DRAM. The buffer chip may
issue the precharge operation to the DRAM chips one, two, or three
clock cycles after it is received. The delay selected may depend on
the presence or absence of address bus conflicts or tRAS
violations, and may change from one precharge operation to
another.
[0088] FIG. 16 illustrates a buffered stack 1600 according to one
embodiment of this invention. The buffered stack includes four 512
Mb DDR2 DRAM circuits (chips) 1602 which a buffer chip 1604 maps to
a single 2 Gb DDR2 DRAM.
[0089] Although the multiple DRAM chips appear to the memory
controller as though they were a single, larger DRAM, the combined
power dissipation of the actual DRAM chips may be much higher than
the power dissipation of a monolithic DRAM of the same capacity. In
other words, the physical DRAM may consume significantly more power
than would be consumed by the emulated DRAM.
[0090] As a result, a DIMM containing multiple buffered stacks may
dissipate much more power than a standard DIMM of the same actual
capacity using monolithic DRAM circuits. This increased power
dissipation may limit the widespread adoption of DIMMs that use
buffered stacks. Thus, it is desirable to have a power management
technique which reduces the power dissipation of DIMMs that use
buffered stacks.
[0091] In one such technique, the DRAM circuits may be
opportunistically placed in low power states or modes. For example,
the DRAM circuits may be placed in a precharge power down mode
using the clock enable (CKE) pin of the DRAM circuits.
[0092] A single rank registered DIMM (R-DIMM) may contain a
plurality of buffered stacks, each including four x4 512 Mb
DDR2 SDRAM chips and appearing (to the memory controller via emulation
by the buffer chip) as a single x4 2 Gb DDR2 SDRAM. The JEDEC
standard indicates that a 2 Gb DDR2 SDRAM may generally have eight
banks, shown in FIG. 16 as Bank 0 to Bank 7. Therefore, the buffer
chip may map each 512 Mb DRAM chip in the stack to two banks of the
equivalent 2 Gb DRAM, as shown; the first DRAM chip 1602A is
treated as containing banks 0 and 1, 1602B is treated as containing
banks 2 and 3, and so forth.
[0093] The memory controller may open and close pages in the DRAM
banks based on memory requests it receives from the rest of the
host system. In some embodiments, no more than one page may be able
to be open in a bank at any given time. In the embodiment shown in
FIG. 16, each DRAM chip may therefore have up to two pages open at
a time. When a DRAM chip has no open pages, the power management
scheme may place it in the precharge power down mode.
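The bank mapping of FIG. 16 and the resulting power-down decision may be sketched as follows. The sketch assumes that consecutive pairs of emulated banks are assigned to each chip and identifies chips by index (0 for 1602A, 1 for 1602B, and so on); both the indexing and the function names are illustrative assumptions.

```python
def chip_for_bank(emulated_bank, num_chips=4, num_banks=8):
    """Index of the physical chip that holds a bank of the emulated DRAM."""
    if not 0 <= emulated_bank < num_banks:
        raise ValueError("bank out of range")
    banks_per_chip = num_banks // num_chips  # two banks per chip in FIG. 16
    return emulated_bank // banks_per_chip


def idle_chips(open_banks, num_chips=4, num_banks=8):
    """Chips with no open pages, which are candidates for power down."""
    busy = {chip_for_bank(b, num_chips, num_banks) for b in open_banks}
    return [chip for chip in range(num_chips) if chip not in busy]


assert [chip_for_bank(b) for b in range(8)] == [0, 0, 1, 1, 2, 2, 3, 3]
# Pages open in banks 0 and 5 keep chips 0 and 2 active; 1 and 3 may power down.
assert idle_chips({0, 5}) == [1, 3]
```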
[0094] The clock enable inputs of the DRAM chips may be controlled
by the buffer chip, or by another chip (not shown) on the R-DIMM,
or by an AMB (not shown) in the case of an FB-DIMM, or by the
memory controller, to implement the power management technique. The
power management technique may be particularly effective if it
implements a closed page policy.
[0095] Another optional power management technique may include
mapping a plurality of DRAM circuits to a single bank of the larger
capacity emulated DRAM. For example, a buffered stack (not shown)
of sixteen x4 256 Mb DDR2 SDRAM chips may be used in
emulating a single x4 4 Gb DDR2 SDRAM. The 4 Gb DRAM is specified
by JEDEC as having eight banks of 512 Mb each, so two of the 256
Mb DRAM chips may be mapped by the buffer chip to emulate each bank
(whereas in FIG. 16 one DRAM was used to emulate two banks).
[0096] However, since only one page can be open in a bank at any
given time, only one of the two DRAM chips emulating that bank can
be in the active state at any given time. If the memory controller
opens a page in one of the two DRAM chips, the other may be placed
in the precharge power down mode. Thus, if a number p of DRAM chips
are used to emulate one bank, at least p-1 of them may be in a
power down mode at any given time; in other words, at least p-1 of
the p chips are always in power down mode, although the particular
powered down chips will tend to change over time, as the memory
controller opens and closes various pages of memory.
[0097] As a caveat on the term "always" in the preceding paragraph,
the power saving operation may comprise operating in precharge
power down mode except when refresh is required.
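The p-1 property may be illustrated with a minimal sketch. Which of the p chips holds a given page is not specified above, so the sketch simply takes the index of the chip with the open page as an input; the function name and state labels are hypothetical, and refresh is ignored.

```python
def bank_power_states(chips_per_bank, open_chip=None):
    """Power state of each of the p chips emulating one bank.

    `open_chip` is the index of the chip holding the open page, or None if
    the bank has no open page.
    """
    if open_chip is not None and not 0 <= open_chip < chips_per_bank:
        raise ValueError("open_chip out of range")
    return ["active" if i == open_chip else "precharge power down"
            for i in range(chips_per_bank)]


# Two chips per bank, as in the sixteen-chip example: at least one is down.
states = bank_power_states(2, open_chip=1)
assert states == ["precharge power down", "active"]
assert states.count("precharge power down") >= 2 - 1
```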
[0098] FIG. 17 is a flow chart 1700 illustrating one embodiment of
a method of refreshing a plurality of memory circuits. A refresh
control signal is received (1702) e.g. from a memory controller
which intends to refresh an emulated memory circuit. In response to
receipt of the refresh control signal, a plurality of refresh
control signals are sent (1704) e.g. by a buffer chip to a
plurality of physical memory circuits at different times. These
refresh control signals may optionally include the received refresh
control signal or an instantiation or copy thereof. They may also,
or instead, include refresh control signals that are different in
at least one aspect (format, content, etc.) from the received
signal.
[0099] In some embodiments, at least one first refresh control
signal may be sent to a first subset of the physical memory
circuits at a first time, and at least one second refresh control
signal may be sent to a second subset of the physical memory
circuits at a second time. Each refresh signal may be sent to one
physical memory circuit, or to a plurality of physical memory
circuits, depending upon the particular implementation.
[0100] The refresh control signals may be sent to the physical
memory circuits after a delay in accordance with a particular
timing. For example, the timing in which they are sent to the
physical memory circuits may be selected to minimize an electrical
current drawn by the memory, or to minimize a power consumption of
the memory. This may be accomplished by staggering a plurality of
refresh control signals. Or, the timing may be selected to comply
with e.g. a tRFC parameter associated with the memory circuits.
[0101] To this end, physical DRAM circuits may receive periodic
refresh operations to maintain integrity of data stored therein. A
memory controller may initiate refresh operations by issuing
refresh control signals to the DRAM circuits with sufficient
frequency to prevent any loss of data in the DRAM circuits. After a
refresh control signal is issued, a minimum time tRFC may be
required to elapse before another control signal may be issued to
that DRAM circuit. The tRFC parameter value may increase as the
size of the DRAM circuit increases.
[0102] When the buffer chip receives a refresh control signal from
the memory controller, it may refresh the smaller DRAM circuits
within the span of time specified by the tRFC of the emulated DRAM
circuit. Since the tRFC of the larger, emulated DRAM is longer than
the tRFC of the smaller, physical DRAM circuits, it may not be
necessary to issue any or all of the refresh control signals to the
physical DRAM circuits simultaneously. Refresh control signals may
be issued separately to individual DRAM circuits or to groups of
DRAM circuits, provided that the tRFC requirements of all physical
DRAMs have been met by the time the emulated DRAM's tRFC has
elapsed. In use, the refreshes may be spaced in time to minimize
the peak current draw of the combination buffer chip and DRAM
circuit set during a refresh operation.
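One way of spacing the refreshes is sketched below: the start times are spread evenly so that the last group finishes exactly when the emulated tRFC elapses. Even spacing is an assumption made for illustration, as are the function name and the example tRFC values (in nanoseconds); the sketch also ignores any delay through the buffer chip.

```python
def stagger_refresh(num_groups, trfc_physical, trfc_emulated):
    """Start times, relative to the received refresh, for each chip group.

    Every group completes its own tRFC no later than the emulated tRFC.
    """
    slack = trfc_emulated - trfc_physical
    if slack < 0:
        raise ValueError("emulated tRFC is shorter than physical tRFC")
    if num_groups == 1:
        return [0]
    step = slack / (num_groups - 1)
    return [i * step for i in range(num_groups)]


# Illustrative values: four chips, 105 ns physical tRFC, 195 ns emulated tRFC.
starts = stagger_refresh(4, 105, 195)
assert starts == [0, 30, 60, 90]
assert all(start + 105 <= 195 for start in starts)
```

Staggering in this way means at most a subset of the physical DRAM circuits begins a refresh at any one instant, which is the mechanism by which peak current draw may be reduced.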
[0103] FIG. 18 illustrates one embodiment of an interface circuit
such as may be utilized in any of the above-described memory
systems, for interfacing between a system and memory circuits. The
interface circuit may be included in the buffer chip, for
example.
[0104] The interface circuit includes a system address signal
interface for sending/receiving address signals to/from the host
system, a system control signal interface for sending/receiving
control signals to/from the host system, a system clock signal
interface for sending/receiving clock signals to/from the host
system, and a system data signal interface for sending/receiving
data signals to/from the host system. The interface circuit further
includes a memory address signal interface for sending/receiving
address signals to/from the physical memory, a memory control
signal interface for sending/receiving control signals to/from the
physical memory, a memory clock signal interface for
sending/receiving clock signals to/from the physical memory, and a
memory data signal interface for sending/receiving data signals
to/from the physical memory.
[0105] The host system includes a set of memory attribute
expectations, or built-in parameters of the physical memory with
which it has been designed to work (or with which it has been told,
e.g. by the buffer circuit, it is working). Accordingly, the host
system includes a set of memory interaction attributes, or built-in
parameters according to which the host system has been designed to
operate in its interactions with the memory. These memory
interaction attributes and expectations will typically, but not
necessarily, be embodied in the host system's memory
controller.
[0106] In addition to physical storage circuits or devices, the
physical memory itself has a set of physical attributes.
[0107] These expectations and attributes may include, by way of
example only, memory timing, memory capacity, memory latency,
memory functionality, memory type, memory protocol, memory power
consumption, memory current requirements, and so forth.
[0108] The interface circuit includes memory physical attribute
storage for storing values or parameters of various physical
attributes of the physical memory circuits. The interface circuit
further includes system emulated attribute storage. These storage
systems may be read/write capable stores, or they may simply be a
set of hard-wired logic or values, or they may simply be inherent
in the operation of the interface circuit.
[0109] The interface circuit includes emulation logic which
operates according to the stored memory physical attributes and the
stored system emulation attributes, to present to the system an
interface to an emulated memory which differs in at least one
attribute from the actual physical memory. The emulation logic may,
in various embodiments, alter a timing, value, latency, etc. of any
of the address, control, clock, and/or data signals it sends to or
receives from the system and/or the physical memory. Some such
signals may pass through unaltered, while others may be altered.
The emulation logic may be embodied as, for example, hard wired
logic, a state machine, software executing on a processor, and so
forth.
CONCLUSION
[0110] When one component is said to be "adjacent" another
component, it should not be interpreted to mean that there is
absolutely nothing between the two components, only that they are
in the order indicated.
[0111] The physical memory circuits employed in practicing this
invention may be any type of memory whatsoever, such as: DRAM, DDR
DRAM, DDR2 DRAM, DDR3 DRAM, SDRAM, QDR DRAM, DRDRAM, FPM DRAM,
VDRAM, EDO DRAM, BEDO DRAM, MDRAM, SGRAM, MRAM, IRAM, NAND flash,
NOR flash, PSRAM, wetware memory, etc.
[0112] The physical memory circuits may be coupled to any type of
memory module, such as: DIMM, R-DIMM, SO-DIMM, FB-DIMM, unbuffered
DIMM, etc.
[0113] The system device which accesses the memory may be any type
of system device, such as: desktop computer, laptop computer,
workstation, server, consumer electronic device, television,
personal digital assistant (PDA), mobile phone, printer or other
peripheral device, etc.
[0114] The various features illustrated in the figures may be
combined in many ways, and should not be interpreted as though
limited to the specific embodiments in which they were explained
and shown.
[0115] Those skilled in the art, having the benefit of this
disclosure, will appreciate that many other variations from the
foregoing description and drawings may be made within the scope of
the present invention. Indeed, the invention is not limited to the
details described above. Rather, it is the following claims
including any amendments thereto that define the scope of the
invention.
* * * * *