U.S. patent application number 11/516824 was filed with the patent office on 2006-09-06 and published on 2007-01-04 for buffered continuous multi-drop clock ring.
Invention is credited to James A. McCall, Clinton F. Walker.
Application Number | 20070002676 11/516824 |
Family ID | 35610192 |
Filed Date | 2006-09-06 |
United States Patent Application | 20070002676 |
Kind Code | A1 |
McCall; James A.; et al. | January 4, 2007 |
Buffered continuous multi-drop clock ring
Abstract
A method, system and apparatus to distribute a clock signal
among a plurality of memory units in a memory architecture. A
buffer chip is coupled to a plurality of memory units each by a
point to point link. The buffer chip includes a clock generator to
generate a continuous free running clock that may be passed
serially through a subset of memory units in the architecture.
Sending of data is delayed over the point to point links based on
proximity of the memory units to the buffer chip to accommodate
delay in the multidrop clock signal.
Inventors: | McCall; James A. (Beaverton, OR); Walker; Clinton F. (Portland, OR) |
Correspondence
Address: |
BLAKELY SOKOLOFF TAYLOR & ZAFMAN
12400 WILSHIRE BOULEVARD
SEVENTH FLOOR
LOS ANGELES
CA
90025-1030
US
|
Family ID: | 35610192 |
Appl. No.: | 11/516824 |
Filed: | September 6, 2006 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
10956397 | Sep 30, 2004 | |
11516824 | Sep 6, 2006 | |
Current U.S. Class: | 365/189.16 |
Current CPC Class: | Y02D 10/151 20180101; Y02D 10/00 20180101; G06F 13/4243 20130101; G06F 1/10 20130101; Y02D 10/14 20180101 |
Class at Publication: | 365/233 |
International Class: | G11C 8/00 20060101 G11C008/00 |
Claims
1. An apparatus comprising: a plurality of memory units; and a
buffer to communicate over a plurality of point to point data
lanes, one data lane to each of the plurality of memory units and
to forward a continuous clock serially through each memory unit to
drive the plurality of data lanes wherein the buffer comprises at
least one time shifter.
2. The apparatus of claim 1 wherein the at least one time shifter
comprises: a plurality of time shifters to shift a timing of data
transmitted on the point to point data lanes based on a proximity
of the memory unit to the buffer.
3. The apparatus of claim 2 wherein each time shifter comprises: a
delay lock loop.
4. The apparatus of claim 1 wherein each memory unit comprises a
dynamic random access memory.
5. The apparatus of claim 1 wherein each data lane is 8 bits
wide.
6. The apparatus of claim 1 wherein the buffer comprises: a clock
generator to provide a free running clock.
7. A method comprising: generating a continuous clock signal;
forwarding the clock signal serially through a plurality of memory
units in decreasing proximity to a clock source; and deskewing the
clock signal relative to a data signal over a point to point link
from a memory unit to the clock source.
8. The method of claim 7 further comprising: supplying data to a
memory unit over a point to point link in quadrature with the clock
signal.
9. The method of claim 8 wherein the supply comprises: delaying
data delivery on a point to point link to a memory unit of the
plurality based on proximity of the memory unit to the clock
source.
Description
BACKGROUND
[0001] This patent application is a continuation of pending U.S.
patent application Ser. No. 10/956,397, filed on Sep. 30, 2004,
entitled, BUFFERED CONTINUOUS MULTI-DROP CLOCK RING.
FIELD OF THE INVENTION
[0002] Embodiments of the invention relate to power and performance
in computer memory systems. More specifically, embodiments of the
invention relate to providing a clocking signal within a memory
subsystem.
BACKGROUND
[0003] The power-performance relationship in the personal computer
(PC) environment continues to pressure platform designers to
improve power at minimal cost. Unfortunately, to accommodate legacy
dynamic random access memory (DRAM) using the industry standard
double data rate 2 (DDR2) feature set, early fully buffered dual
in-line memory modules (DIMMs) (FBD) require higher power levels than
prior evolutionary approaches as a result of the addition of a
buffer chip. This feature set is defined in JEDEC Standard DDR2
SDRAM Specification JESD79-2A, published Jan. 2004 (the DDR2
Standard). Moreover, the DDR2 feature set limited the ability to
enable features in the buffer-DRAM interface to reduce power and
improve performance at lower cost.
[0004] Existing designs use an architecture with bi-directional
strobes generated from the buffer chip to the DRAM. In this design,
one output strobe is required per DRAM. The strobe design results
in timing problems at higher speeds because of the uncertainty
caused by drift between the time a command is issued and the time,
N unit intervals later, at which it is executed. While a steady
state clock eliminates this uncertainty, it would double the pin
count at both the DRAM and the buffer chip. Such increased pin count
results in increased cost and power dissipation.
BRIEF DESCRIPTION OF THE DRAWINGS
[0005] The invention is illustrated by way of example and not by
way of limitation in the figures of the accompanying drawings in
which like references indicate similar elements. It should be noted
that references to "an" or "one" embodiment in this disclosure are
not necessarily to the same embodiment, and such references mean at
least one.
[0006] FIG. 1 is a block diagram of a system of one embodiment of
the invention.
[0007] FIG. 2 is a timing diagram of timeshifting data to
accommodate a resulting timeshift in a free running clock in one
embodiment of the invention.
[0008] FIG. 3 is a timing diagram of an example of the free running
clock in one embodiment of the invention.
DETAILED DESCRIPTION
[0009] FIG. 1 is a block diagram of a system of one embodiment of
the invention. A processor 102 is coupled by a system bus 104 to
chipset 106. Chipset 106 provides an interface between the
processor 102 and input/output (I/O) devices 108 via an I/O bus
110. Additionally, chipset 106 includes a memory controller 112
which communicates over a high speed link 114 to a buffer chip 120
of a dual inline memory module (DIMM) 100. In an alternative
embodiment a single inline memory module (SIMM) may be used.
[0010] DIMM 100 may be inserted into a memory card slot of a
motherboard (not shown). DIMM 100 includes two banks of memory units,
a first bank (right bank) including dynamic random access memories
142-1 through 142-4 (collectively DRAM 142), and a second bank
(left bank) including DRAMs 152-1 through 152-4 (collectively DRAM
152). More or fewer memory units may exist in each bank of memory
units. Buffer chip 120 controls the reading and
writing from the plurality of memory units, e.g., DRAMs 142 and
152. Buffer chip 120 may be an integrated circuit (IC) fabricated
using any conventional or subsequently developed technology.
[0011] Buffer chip 120 includes at least one clock generator 122 to
generate and source a free running (continuous) clock signal. In
one embodiment, separate clock generators exist for each bank of
memory units. In another embodiment, the continuous clock signal
from a single clock generator 122 is split and supplied to both
banks of memory units.
[0012] In one embodiment, a clock signal is distributed serially
through a subset of the memory units, e.g., DRAMs 142 along
clockline 140. In one embodiment, the clock signal is passed in a
ring serially through DRAM 142-1 to DRAM 142-2 to DRAM 142-3 to
DRAM 142-4 and back through DRAM 142-4, DRAM 142-3, DRAM 142-2,
DRAM 142-1 and then returns to the buffer chip 120. In one
embodiment, the clock serves as a write clock as it moves through
the memory units in decreasing proximity to the buffer chip 120 and
serves as a read clock as it returns with increasing proximity to
the buffer chip 120.
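The ring traversal described above can be sketched in a few lines of Python. This is only an illustration: the function name `clock_ring_path` and the tuple layout are hypothetical, and the bank number 142 is used simply because it is the bank named in the paragraph.

```python
def clock_ring_path(num_units, bank=142):
    """List the (unit, role) hops for one trip of the clock around the ring.

    Hypothetical helper: the clock leaves the buffer, visits the units in
    decreasing proximity (acting as a write clock), then returns through
    the same units in increasing proximity (acting as a read clock).
    """
    units = [f"DRAM {bank}-{i}" for i in range(1, num_units + 1)]
    outbound = [(unit, "write clock") for unit in units]
    inbound = [(unit, "read clock") for unit in reversed(units)]
    return outbound + inbound


path = clock_ring_path(4)
for unit, role in path:
    print(f"{unit}: {role}")
```

The most distant unit appears twice in a row, once at the end of the outbound leg and once at the start of the return leg, which matches the order given in the text.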
[0013] A point to point link between the buffer chip and each DRAM
also exists. This point to point link is a path by which data may
be sent to each DRAM. This path is also referred to herein as a
data lane. In one embodiment, each data lane is 8 bits wide. Thus,
data lanes 162-1 through 162-4 (collectively 162) and 172-1 through
172-4 (collectively 172) are shown. Use of the free running
multi-drop clock reduces the pin count on both the DRAMs and the
buffer chip over prior art strobing methods. However, the
multi-drop clock topology results in a delay of the arrival of the
clock signal at the DRAMs relative to the arrival of data
(D1×8 through D4×8) over the point to point link. This
delay increases with increasing distance from (decreasing proximity
to) the buffer chip 120. Thus, the clock signal, assuming it is
concurrently sent in quadrature with the data on data lane 162-4,
would have a relationship furthest from quadrature when it arrives
at DRAM 142-4. However, by providing timeshifters 124-1 through
124-4 (collectively 124) to timeshift data sent over data lanes 162,
quadrature synchronization can be achieved at each of the inline
memory units. Because the distance is known and the delay for each
drop can be simulated, the delay for each timeshifter can be
established in advance using delay lock loops 160-1 through 160-4.
In one embodiment, timeshifter 124-1 may be omitted since the
signal should arrive at the first DRAM in substantially the same
relationship as it had when departing the buffer chip 120. In another
embodiment, timeshifters 124 may only be used for data lanes where
the clock delay is determined to be likely to cause errors in
writing valid data.
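As a rough illustration of how per-lane timeshifts could be derived from simulated drop delays, the following Python sketch accumulates the clock delay along the clock line. The function name `write_timeshifts`, the `skip_first` option and the picosecond values are hypothetical and do not come from this disclosure. The sketch also assumes that all point to point data lanes have matched flight times, so only the clock delay needs compensating.

```python
from itertools import accumulate


def write_timeshifts(hop_delays_ps, skip_first=False):
    """Return the data delay (in ps) to apply on each data lane.

    hop_delays_ps[i] is the assumed extra clock delay added by the hop
    into memory unit i+1. The clock delay seen at a unit is the running
    sum of the hops before it, so the data on that unit's lane is
    delayed by the same amount to preserve the quadrature relationship.
    """
    shifts = list(accumulate(hop_delays_ps))
    if skip_first:
        # Models the embodiment in which the first timeshifter is omitted.
        shifts[0] = 0
    return shifts


# Illustrative (made-up) hop delays for four memory units, in picoseconds.
shifts = write_timeshifts([20, 150, 150, 150])
print(shifts)
```

Because the running sum only grows, the lane to the most distant unit always receives the largest shift, which is the behavior the paragraph describes.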
[0014] Similarly, the read clock is provided as the clock signal
returns through each memory unit in series. Thus, for example, the
read will be initiated at point 158. However, the clock signal will
not return to the buffer chip 120 until after the read data is
received at the buffer chip over data lane 172-4. Thus, it is
necessary to delay the read data to synchronize with the returning
clock. Deskew logic 126 provides for the deskewing of the phase
relationship of the received data (D1×8 through D4×8)
and the returning clock signal on signal line 150. A plurality of
delay lock loops (DLLs) may be employed to appropriately delay the
clock to deskew this phase relationship. This ensures valid data
(D1×8 through D4×8) will be returned to the memory
controller 112 for use by the processor or other requesting
device.
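One way to picture the read-side deskew is to compute, for each memory unit, how much longer the returning clock still has to travel before it reaches the buffer. The Python sketch below does this under stated assumptions: the function name `read_deskew_delays` and the hop delays are hypothetical, and the common flight time of the data lanes is treated as zero so that only the remaining clock travel time is modeled.

```python
def read_deskew_delays(return_hop_delays_ps):
    """Map each memory unit number to the delay (ps) applied to its data.

    return_hop_delays_ps lists the assumed clock delay of each return
    hop, ordered from the most distant unit toward the buffer; the last
    entry is the hop from unit 1 into the buffer. A unit launches its
    data when the returning clock reaches it, so its data must wait for
    the clock's remaining travel time to the buffer.
    """
    num_units = len(return_hop_delays_ps)
    remaining = sum(return_hop_delays_ps)
    delays = {}
    for k, hop in enumerate(return_hop_delays_ps):
        unit = num_units - k       # most distant unit is visited first
        delays[unit] = remaining   # clock travel still ahead of this unit
        remaining -= hop
    return delays


# Illustrative (made-up) return hops: 4->3, 3->2, 2->1, 1->buffer.
delays = read_deskew_delays([150, 150, 150, 20])
print(delays)
```

In this model the most distant unit needs the largest delay and the nearest unit the smallest, which mirrors the decreasing skew on the return path.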
[0015] While the read operation has been described relative to the
lefthand bank of memory units and the write operations have been
described relative to the righthand bank of memory units, it should
be understood that reading and writing occur over both banks of
memory units and may be performed analogously on either side of the
DIMM 100. Thus, in one embodiment, deskew logic is duplicated and
is available for use by each bank of memory units. Similarly,
timeshifters may be supplied for each bank of memory units.
Moreover, as noted above, in one embodiment, two clock generators
exist on buffer chip 120, one to supply a clock over signal line
140 and one to supply a clock over signal line 150. In another
embodiment, a single clock generator is used to supply clocks over
both signal line 140 and signal line 150.
[0016] FIG. 2 is a timing diagram of timeshifting data to
accommodate a resulting timeshift in a free running clock in one
embodiment of the invention. As can be seen, the clock at the buffer
chip has a quadrature relation with the data. However, as the clock
signal transitions through each successive memory unit, the
timeshift T_1SFT/T_2SFT/T_3SFT/T_4SFT becomes
increasingly great. Thus, if the data were sent over the data lanes
concurrently with the clock leaving the buffer, the memory units
more distal to the buffer chip would be increasingly likely to
write invalid data. Thus, within the buffer chip, a timeshift of
the data is introduced to ensure that the quadrature relationship
between the clock at the memory module and the receipt of valid
data is maintained.
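To make the drift away from quadrature concrete, the following Python sketch converts an uncompensated clock skew into a clock-to-data phase in degrees, where 90 degrees is ideal quadrature. The function name `phase_offset_deg`, the 2000 ps clock period and the skew values are assumptions chosen for illustration; none of them appear in this disclosure.

```python
def phase_offset_deg(skew_ps, period_ps):
    """Clock-to-data phase at a memory unit, in degrees.

    A skew of zero leaves the clock in ideal quadrature (90 degrees);
    any residual skew adds 360 * skew / period degrees on top of that.
    """
    return 90.0 + 360.0 * skew_ps / period_ps


PERIOD_PS = 2000                  # assumed clock period
SKEWS_PS = [20, 170, 320, 470]    # assumed uncompensated skew per unit

for unit, skew in enumerate(SKEWS_PS, start=1):
    uncompensated = phase_offset_deg(skew, PERIOD_PS)
    # After the buffer timeshifts the data by the same amount, the
    # residual skew in this simple model is zero.
    compensated = phase_offset_deg(0, PERIOD_PS)
    print(f"unit {unit}: {uncompensated:.1f} deg uncompensated, "
          f"{compensated:.1f} deg with timeshift")
```

With these made-up numbers the phase at the most distant unit moves far from 90 degrees, which is why that unit is the most likely to capture invalid data when no timeshift is applied.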
[0017] FIG. 3 is a timing diagram of an example of the free running
clock in one embodiment of the invention. The clock first appears
recirculated at the memory unit most distant from the buffer chip.
Because the memory unit does not have logic to ensure any
particular phase relationship with the clock, the memory unit
places the data on the data lane in response to receipt of the
clock without concern for phase relation/clock time. A decreasing
clock skew relative to the data returned occurs as the clock
returns to the buffer in increasing proximity for each successive
memory unit. At the buffer, deskew logic ensures the quadrature
phase relationship by delaying the data from the respective memory
units by times T_4, T_3, T_2 and T_1, respectively. In
this manner, deskew logic on the buffer chip ensures valid data
capture at the buffer chip.
[0018] In the foregoing specification, the invention has been
described with reference to specific embodiments thereof. It will,
however, be evident that various modifications and changes can be
made thereto without departing from the broader spirit and scope of
the invention as set forth in the appended claims. The
specification and drawings are, accordingly, to be regarded in an
illustrative rather than a restrictive sense.
* * * * *