U.S. patent application number 12/316986 was filed with the patent office on 2008-12-18 and published on 2010-06-24 for data error recovery in non-volatile memory.
Invention is credited to Richard Coulson, Albert Fazio, Jawad Khan.
Publication Number | 20100162084 |
Application Number | 12/316986 |
Family ID | 42267898 |
Filed Date | 2008-12-18 |
United States Patent Application | 20100162084 |
Kind Code | A1 |
Coulson; Richard; et al. | June 24, 2010 |
Data error recovery in non-volatile memory
Abstract
When an error correction code (ECC) unit finds uncorrectable
errors in a solid state non-volatile memory device, a process may
be used in an attempt to locate and correct the errors. This
process may first identify `low confidence` memory cells that are
likely to contain errors, and then determine what data is more
likely to be correct in those cells, based on various criteria. The
new data may then be checked with the ECC unit to verify that it is
sufficiently correct for the ECC unit to correct any remaining
errors.
Inventors: | Coulson; Richard (Portland, OR); Fazio; Albert (Saratoga, CA); Khan; Jawad (Hillsboro, OR) |
Correspondence Address: | INTEL CORPORATION; c/o CPA Global, P.O. BOX 52050, MINNEAPOLIS, MN 55402, US |
Family ID: | 42267898 |
Appl. No.: | 12/316986 |
Filed: | December 18, 2008 |
Current U.S. Class: | 714/773; 711/103; 711/E12.103; 714/E11.034 |
Current CPC Class: | G11C 2029/0411 20130101; G11C 16/3404 20130101; G11C 29/52 20130101; G06F 11/1068 20130101 |
Class at Publication: | 714/773; 711/103; 714/E11.034; 711/E12.103 |
International Class: | G06F 12/16 20060101 G06F012/16; G11C 29/52 20060101 G11C029/52; G06F 11/10 20060101 G06F011/10 |
Claims
1. A method, comprising: determining that binary data read from a
specified range of sequential memory locations in a charge-based
non-volatile (NV) memory contains errors that were uncorrected by
an error correcting code (ECC) unit associated with the NV memory;
identifying which memory cells in the specified range produced data
that is likely to be in error; changing the data for at least some
of the cells whose data was determined to likely be in error; and
verifying whether the changed data is sufficiently correct for the
ECC unit to provide correct data for the specified range.
2. The method of claim 1, wherein said identifying comprises:
producing a map of charge level values for the memory cells in at
least the specified range; and comparing the charge level values in
groups of cells in the map to predefined patterns of charge levels,
to identify which of the memory cells produced binary data likely
to be in error.
3. The method of claim 1, wherein said identifying comprises:
determining analog charge values for cells in the specified range;
and determining which cells in the specified range have an analog
charge value within a predetermined voltage amount of a read
reference voltage for the NV memory.
4. The method of claim 1, wherein said changing comprises adjusting
analog charge values for the memory cells identified as likely to
be in error.
5. The method of claim 1, wherein said changing comprises
substituting random binary data for the data from the memory cells
identified as likely to be in error.
6. The method of claim 1, further comprising: incrementing a read
reference voltage through a range of voltages; reading, for each
increment, binary data from within the specified range of
sequential memory locations; and identifying which cells produce
different binary data in a current increment than for the previous
increment.
7. The method of claim 1, wherein the specified range of sequential
memory locations is a page of memory.
8. An apparatus, comprising a computer system containing a
processor and a charge-based non-volatile (NV) memory, the computer
system to perform: determining that binary data read from a
specified range of sequential memory locations in the NV memory
contains errors that were uncorrected by an error correcting code
(ECC) unit associated with the NV memory; identifying which memory
cells in the specified range produced data that is likely to be in
error; changing the data for at least some of the cells whose data
was determined to likely be in error; and verifying whether the
changed data is sufficiently correct for the ECC unit to provide
correct data for the specified range.
9. The apparatus of claim 8, wherein said identifying comprises:
producing a map of charge level values for the memory cells in at
least the specified range; and comparing the charge level values in
groups of cells in the map to predefined patterns of charge levels,
to identify which of the memory cells produced binary data likely
to be in error.
10. The apparatus of claim 8, wherein said identifying comprises:
determining analog charge values for cells in the specified range;
and determining which cells in the specified range have an analog
charge value within a predetermined voltage amount of a read
reference voltage for the NV memory.
11. The apparatus of claim 8, wherein said changing comprises
adjusting analog charge values for the memory cells identified as
likely to be in error.
12. The apparatus of claim 8, wherein said changing comprises
substituting random binary data for the data from the memory cells
identified as likely to be in error.
13. The apparatus of claim 8, further comprising: incrementing a
read reference voltage through a range of voltages; reading, for
each increment, binary data from within the specified range of
sequential memory locations; and identifying which cells produce
different binary data in a current increment than for the previous
increment.
14. An article comprising a tangible computer-readable medium that
contains instructions, which when executed by one or more
processors result in performing operations comprising: determining
that binary data read from a specified range of sequential memory
locations in a charge-based non-volatile (NV) memory contains
errors that were uncorrected by an error correcting code (ECC) unit
associated with the NV memory; identifying which memory cells in
the specified range produced data that is likely to be in error;
changing the data for at least some of the cells whose data was
determined to likely be in error; and verifying whether the changed
data is sufficiently correct for the ECC unit to provide correct
data for the specified range.
15. The article of claim 14, wherein the operation of identifying
comprises: producing a map of charge level values for the memory
cells in at least the specified range; and comparing the charge
level values in groups of cells in the map to predefined patterns
of charge levels, to identify which of the memory cells produced
binary data likely to be in error.
16. The article of claim 14, wherein the operation of identifying
comprises: determining analog charge values for cells in the
specified range; and determining which cells in the specified range
have an analog charge value within a predetermined voltage amount
of a read reference voltage for the NV memory.
17. The article of claim 14, wherein the operation of changing
comprises adjusting analog charge values for the memory cells
identified as likely to be in error.
18. The article of claim 14, wherein the operation of changing
comprises substituting random binary data for the data from the
memory cells identified as likely to be in error.
19. The article of claim 14, wherein the operations further
comprise: incrementing a read reference voltage through a range of
voltages; reading, for each increment, binary data from within the
specified range of sequential memory locations; and identifying
which cells produce different binary data in a current increment
than for the previous increment.
20. The article of claim 14, wherein the specified range of
sequential memory locations is a page of memory.
Description
BACKGROUND
[0001] Some types of solid state non-volatile memory, such as flash
memory, record binary data by storing a certain amount of
electrical charge in a memory cell. When the data is read from one
of these charge-based non-volatile memories, the voltage level of
the stored charge is compared to a reference voltage. The binary
value of the data read from that cell depends on whether the
voltage of the stored charge is higher or lower than the reference
voltage. However, since the stored charge is an analog phenomenon,
its actual value may not be exactly what was intended, and errors
may be encountered when the data is read. Error correcting code
(ECC) units may be used to detect and correct some of these errors,
but sometimes the errors are too numerous to all be corrected in
this manner. When this happens, the data may be permanently
lost.
BRIEF DESCRIPTION OF THE DRAWINGS
[0002] Some embodiments of the invention may be understood by
referring to the following description and accompanying drawings
that are used to illustrate embodiments of the invention. In the
drawings:
[0003] FIG. 1 shows a system containing a solid state non-volatile
memory, according to an embodiment of the invention.
[0004] FIG. 2 shows a portion of a non-volatile memory array,
according to an embodiment of the invention.
[0005] FIG. 3 shows a flow diagram of a method of correcting errors
in a memory, according to an embodiment of the invention.
[0006] FIG. 4 shows an example charge pattern that may affect a
target cell, according to an embodiment of the invention.
[0007] FIG. 5 shows a flow diagram of a method of pattern-matching
to identify low-confidence cells in a NV memory array, according
to an embodiment of the invention.
[0008] FIG. 6 shows a flow diagram of a method of using proximity
to reference voltages as a way to identify low-confidence cells in
a NV memory array, according to an embodiment of the invention.
[0009] FIG. 7 shows a flow diagram of a method of correcting data
in a NV memory through adjustment of analog charge values,
according to an embodiment of the invention.
DETAILED DESCRIPTION
[0010] In the following description, numerous specific details are
set forth. However, it is understood that embodiments of the
invention may be practiced without these specific details. In other
instances, well-known circuits, structures and techniques have not
been shown in detail in order not to obscure an understanding of
this description.
[0011] References to "one embodiment", "an embodiment", "example
embodiment", "various embodiments", etc., indicate that the
embodiment(s) of the invention so described may include particular
features, structures, or characteristics, but not every embodiment
necessarily includes the particular features, structures, or
characteristics. Further, some embodiments may have some, all, or
none of the features described for other embodiments.
[0012] In the following description and claims, the terms "coupled"
and "connected," along with their derivatives, may be used. It
should be understood that these terms are not intended as synonyms
for each other. Rather, in particular embodiments, "connected" is
used to indicate that two or more elements are in direct physical
or electrical contact with each other. "Coupled" is used to
indicate that two or more elements co-operate or interact with each
other, but they may or may not be in direct physical or electrical
contact.
[0013] As used in the claims, unless otherwise specified the use of
the ordinal adjectives "first", "second", "third", etc., to
describe a common element merely indicates that different instances
of like elements are being referred to, and are not intended to
imply that the elements so described must be in a given sequence,
either temporally, spatially, in ranking, or in any other
manner.
[0014] Various embodiments of the invention may be implemented in
one or any combination of hardware, firmware, and software. The
invention may also be implemented as instructions contained in or
on a computer-readable medium, which may be read and executed by
one or more processors to enable performance of the operations
described herein. A computer-readable medium may include any
mechanism for storing, transmitting, and/or receiving information
in a form readable by one or more computers. For example, a
computer-readable medium may include a tangible storage medium,
such as but not limited to read only memory (ROM); random access
memory (RAM); magnetic disk storage media; optical storage media; a
flash memory device, etc. A computer-readable medium may also
include a propagated signal which has been modulated to encode the
instructions, such as but not limited to electromagnetic, optical,
or acoustical carrier wave signals.
[0015] In various embodiments, if uncorrectable errors are found
when reading data from a portion of charge-based non-volatile
memory, a process may be followed to attempt to correct the data.
Since the exact location of the bad data may not be known, this
process may comprise 1) identifying `low confidence` (LC) storage
cells in that portion of memory (i.e., cells that are more likely
than other cells to contain errors), 2) determining what data in
those cells is likely to be correct, and 3) verifying that the new
data is correct. Identifying low confidence cells may be done in
either of two ways: 1) find cells whose analog charge voltage is
close to a reference voltage, or 2) look for particular patterns of
charge levels in the surrounding cells that are known to cause data
corruption in a target cell. Determining new data that is likely to
be correct in the low confidence cells may be done in either of two
ways: 1) adjust the analog charge value in those cells in a
direction and amount that seems best, or 2) try random values of
data in the LC cells. Verifying that the new data is correct may be
performed in any suitable manner, but may typically be done by
using an error checking and correction (ECC) algorithm, since this
will produce valid data as long as the number of errors is below a
certain threshold. Once the correct values of the data have been
determined, the data may be re-written to another location, where
it hopefully will not experience the same problems of data
corruption. Although computationally intensive, this process may be
useful for recovering data that is impractical to recover in other
ways, such as data in a solid state disk (SSD) that contains fatal
errors in important data.
[0016] FIG. 1 shows a system containing a solid state non-volatile
memory, according to an embodiment of the invention. The
illustrated system 100 comprises a processor 110, a main memory
120, input-output logic 130, and a non-volatile (NV) memory 140. In
this particular implementation, the NV memory is attached as an I/O
device (such as but not limited to a solid-state disk), but other
embodiments may place the NV memory elsewhere in the system, such
as but not limited to a part of the main memory itself, a cache
memory working in cooperation with a hard disk drive, etc. Various
embodiments of the invention should be usable in diverse
applications, and in different parts of a system, whether or not
those applications and parts are specifically described here.
[0017] The NV memory may employ any feasible type of NV storage
technology that uses stored charge to store data, and uses one or
more reference voltages for read operations. It may be particularly
useful in NV memory that reads an entire range of sequential memory
locations with a single read command (such as but not limited to
reading a page of memory from a NAND flash memory array), rather
than reading an individual byte or word with a single read
command.
[0018] In the illustrated embodiment of FIG. 1, NV memory 140 may
comprise a storage array 148 and a controller 142 to control
operations with the array such as read, write, erase, and
adjustment of reference voltages. The controller 142 may be further
separated into other functional units, such as error checking and
correction (ECC) unit 143, error analysis unit 144, and reference
voltage control unit 145. The reference voltage control unit 145
may provide each of the one or more reference voltages used when
reading data from the memory cells in the array. The reference
voltage control unit 145 may also adjust these reference voltages
up and/or down as needed to perform the operations described
herein. Although the various units in NV memory 140 are shown as
separate functional units, two or more of them may share common
circuitry and/or code.
[0019] Whenever the controller 142 receives a read request from the
processor 110 or other device, the controller 142 may initiate an
operation that reads data from multiple sequential locations in the
memory array. The starting address of the locations may be
indicated by the read request, while the number of locations may be
specified in the request or may be predefined in some other manner.
As the data is read and placed in a buffer, the ECC unit may detect
errors in the data, keep track of those errors, and correct the
errors that it is able to correct through its error-correction
algorithm. When non-correctable errors (i.e., not correctable by
the ECC unit) are detected in this manner, the data from an entire
range of sequential addresses (e.g., a page or a sector of memory)
may be designated as being incorrect, since the quantity and
location of the uncorrectable errors within that range are
unknown.
[0020] FIG. 2 shows a portion of a non-volatile memory array,
according to an embodiment of the invention. The illustrated
example shows a common representation of a flash memory array, in
which each transistor represents a memory cell in which the amount
of stored charge near the gate represents the data stored in that
cell. Types of memory other than flash memory, and circuits other
than the one shown, may also be included in the various embodiments
of the invention. In the particular example shown, each of the
horizontal lines represents a word line connecting the gates of
multiple cells in that row (in this example approximately 32
thousand cells, but that number may differ in other embodiments),
while the cells in each bit line column are connected to each of
the adjacent cells in that column. The cells may represent single
level cell (SLC) or multilevel cell (MLC) technology. As the terms
are used in this document, SLC refers to technology that stores
only one bit per cell, while MLC refers to technology that stores
two or more bits per cell. The illustrated array represents a
four-level MLC technology, where the range of possible charge is
divided into four sub-ranges, labeled L0 through L3, and each
sub-range represents a different 2-bit combination (e.g., 11, 10,
00, or 01). If the charge in a particular cell falls within one of
those four sub-ranges, the cell is considered to be storing the
corresponding 2-bit binary value. Example values for 15 of the
cells are given, with an unknown charge Vt on a 16th cell.
[0021] For the following discussions, it is assumed that an ECC or
other type of error detection and correction algorithm has been
used, but it cannot correct all the errors in a given range of
sequential memory addresses. In such cases, it is known that the
errors are contained within this range, but it is not known exactly
which addresses (and therefore which cells) contain the errors. In
many cases it is not known how many errors there are, except that
the number of errors exceeds the ability of the ECC to correct them
all. For this discussion, it is assumed that the entire page is
considered to be `failed` because the ECC unit could not correct
all the errors in that page, but units other than a page (e.g., a
sector) may also be examined in this manner if the ECC code block
size renders the failed unit a different size than a page. The
following definitions are used in this document:
[0022] 1) analog charge value--a value that represents the voltage
for the charge stored in a particular cell. Although this value may
be expressed as a discrete digital or binary number for processing,
it represents the analog charge value and is therefore labeled as
an analog value.
[0023] 2) charge level value--a value that represents one of the
sub-ranges of charge level in the cell, in which each sub-range
represents a different binary data value. For example, in a
four-level MLC, a charge level value of 0 may represent the lowest
sub-range (least amount of charge), 1 the next higher sub-range, 2
the next higher sub-range, and 3 the highest sub-range (greatest
amount of charge).
[0024] 3) binary data value--the binary value of the data that is
being stored in a cell. For example, in a four-level MLC, a binary
11 may be represented by charge level value of 0, a binary 10 may
be represented by a charge level value of 1, a binary 00 may be
represented by a charge level value of 2, and a binary 01 may be
represented by a charge level value of 3. These particular
conversion values may be advantageous because a transition from one
charge level to the next only changes one bit of the equivalent
binary value, reducing the uncertainty of a borderline reading to
only two possibilities. However, other conversion values may also
be used.
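The conversion in [0024] can be sketched as a small lookup table. This is only an illustration under the example mapping given above, not part of the application; the names LEVEL_TO_BITS, BITS_TO_LEVEL, and bit_distance are hypothetical. The sketch checks the property noted in the text, that neighboring charge levels differ in only one bit.

```python
# Hypothetical lookup tables for the four-level MLC example in [0024].
LEVEL_TO_BITS = {0: "11", 1: "10", 2: "00", 3: "01"}
BITS_TO_LEVEL = {bits: level for level, bits in LEVEL_TO_BITS.items()}


def bit_distance(a: str, b: str) -> int:
    """Count the bit positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


# Distance between each pair of neighboring levels (L0-L1, L1-L2, L2-L3).
# A distance of 1 means a borderline reading leaves only one bit in doubt.
adjacent_distances = [
    bit_distance(LEVEL_TO_BITS[level], LEVEL_TO_BITS[level + 1])
    for level in range(3)
]
```

With a different conversion table, the same check would show whether a single-level slip could corrupt more than one bit.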
[0025] FIG. 3 shows a flow diagram of a method of correcting errors
in a memory, according to an embodiment of the invention. In flow
diagram 300, operations 310, 314, 318 and operations 311, 315
represent two different ways to identify low confidence (LC) cells
in a page of data that is considered failed. Operations 320, 324,
328 and operation 321 represent two different ways to attempt to
correct the data in those LC cells. The verification operation at
330 may be the same, regardless of which combination of
identification and correction processes is used.
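One combination from this flow, random substitution (operation 321) followed by verification (operation 330), might be sketched as below. This is a toy illustration under loud assumptions: one bit per cell (SLC), a hypothetical function name recover_by_random_substitution, and a stand-in checker toy_ecc_accepts that compares against known-good data purely so the example can run. A real ECC unit would decide from its check bits, not from the true data.

```python
import random
from typing import Callable, List, Optional, Sequence


def recover_by_random_substitution(
    data: Sequence[int],
    lc_cells: Sequence[int],
    ecc_accepts: Callable[[List[int]], bool],
    max_tries: int = 1000,
    seed: int = 0,
) -> Optional[List[int]]:
    """Try random bits in the low-confidence cells until the check passes."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        candidate = list(data)
        for index in lc_cells:
            candidate[index] = rng.randint(0, 1)
        if ecc_accepts(candidate):
            return candidate
    return None  # no acceptable candidate found within max_tries


# ASSUMPTION: toy stand-in for the ECC unit. It accepts data that has at
# most one wrong bit, mimicking a code that can correct a single error.
TRUE_DATA = [1, 0, 1, 1, 0, 0, 1, 0]


def toy_ecc_accepts(candidate: List[int]) -> bool:
    errors = sum(a != b for a, b in zip(candidate, TRUE_DATA))
    return errors <= 1


failed_read = [1, 1, 1, 0, 0, 1, 1, 0]  # three wrong bits, at cells 1, 3, 5
recovered = recover_by_random_substitution(failed_read, [1, 3, 5], toy_ecc_accepts)
```

Only the cells listed as low confidence are altered; every other cell keeps the value that was originally read.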
Identification Through Pattern Matching
[0026] This process corresponds with operations 310, 314, 318 of
FIG. 3. With the ever-smaller geometries used in memory arrays, the
amount of charge stored in each cell may be affected by the amount
of charge stored in the surrounding cells, pulling that charge up
or down from what it was intended to be. Referring back to FIG. 2
for an example, the cell with a charge level of Vt is the `target`
cell being investigated by examining the possible effect of the
surrounding cells on Vt. Arrows indicate an example of the possible
cells whose charge level might affect the charge Vt because of
their physical proximity or electrical connection to the target
cell. The exact nature of this effect, and the particular cells
that might cause it, may depend on the specifics of the array, such
as physical closeness of the cells, the cell structure, the
semiconductor materials used, the actual charge level in each of
the surrounding cells and in the target cell, etc. Such
interrelationships may be determined for each type of array, and
for each pattern of charge distribution in the cells.
[0027] FIG. 4 shows an example charge pattern that may affect a
target cell, according to an embodiment of the invention. For the
purposes of this document, a `pattern` is a group of cells with a
predefined physical arrangement, with predefined electrical
connections to each other, and predefined charge levels within
those cells. A target cell is a cell that occupies a predefined
place within the physical arrangement, and also is the cell that is
being examined to determine the possible effect of other cells in
the pattern upon it. A particular pattern may be defined based on
the charge levels in all the cells in the pattern, including the
charge level in the target cell. For example, in FIG. 4 the cells
on both sides of the target cell (adjacent cells on the same word
line) have a charge level of L3, while the cells immediately above
and below the target cell (adjacent cells on the same bit line)
have charge level of L2. If the correct charge level in the target
cell is supposed to be L0, these surrounding cells might pull that
charge in the target cell to an L1 level (as shown), and so having
a charge of L1 in the target cell is part of this pattern. On the
other hand, if the charge level in the target cell is supposed to
be L2 or L3, the charge in the surrounding cells may be close
enough to that level that they have little effect on it, and so
that arrangement would not be considered a predefined pattern to be
searched for. In some embodiments, the charge level for any given
cell in the pattern might cover more than one level. Using the
pattern of FIG. 4 as an example, any of the four cells adjacent to
the target cell might be allowed to have a charge level of either
L2 or L3, and that would still be considered as matching the
pattern.
[0028] This example shows a simple five-cell pattern (the target
cell and the adjacent cells on the same word line and bit line)
with a specific charge distribution, but other patterns may involve
different arrangements of cells around the target cell and/or
different quantities of cells and/or different charge distribution
in the cells. Although the same set of patterns may be expected to
apply to large portions of the array, some patterns may be
applicable only to specific portions of the array (e.g., a target
cell at the edge of the array would have no adjacent cell on one
side, so a different set of patterns might be used for cells at the
edge). Because each cell might be examined separately with this
pattern-matching technique, a given cell might be considered a
target cell in one instance, but be one of the surrounding cells
when another cell is being targeted.
[0029] FIG. 5 shows a flow diagram of a method of pattern-matching
to identify low-confidence cells in a NV memory array, according to
an embodiment of the invention. In the described method, it is
assumed that a particular page of data in the memory has one or
more errors that were uncorrectable by the ECC unit, and that these
errors were at least partly caused by the type of charge
distribution effects previously described, so that the correct data
may be determined by analyzing that charge distribution. For this
discussion, it is assumed that an entire page is considered failed
because the ECC unit could not correct all the errors in that page,
but units other than a page (e.g., a sector) may also be examined
in this manner if the ECC code block size renders the failed unit a
different size than a page.
[0030] In the illustrated flow diagram 500, at 510 the binary data
in the entire erase block containing the failed page may be read.
Although the entire page is considered `failed` (because it is
known to have errors but the locations of the errors within the
page are unknown), it is still possible to read data from all cells
in the page, even though some of that data will be incorrect. It
may be desirable to read the entire erase block, rather than just
the failed page, because of the way that NV memories are typically
laid out, which permits cells in the correctly-read pages in the
erase block to affect cells in the failed page in the same erase
block.
[0031] At 520, the binary data read from the erase block may be
converted to charge level values. The exact manner of this
conversion may depend on how many charge levels (and therefore how
many bits of binary data) are contained in each cell, and on how
each of those levels indicates a particular binary data value. The
results of this conversion may be placed into a charge level map
that contains the charge level value for each cell in the erase
block. The map may also indicate (e.g., through the organization of
the map) how those cells relate to each other, physically and/or
electrically, so that they may be grouped together into groups that
are meaningful for the subsequent pattern-matching process. One
such manner of organization, though not the only manner, is to
organize the map into a two-dimensional array that reflects the row
and column electrical layout of the cells in the erase block.
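The conversion at 520 might look like the following sketch. It assumes the four-level mapping of [0024] and assumes the two bits of each cell have already been grouped together, with rows standing for word lines and columns for bit lines. The function name build_charge_level_map and the sample data are hypothetical.

```python
from typing import Dict, List

# Mapping from the example in [0024]; other conversion values are possible.
BITS_TO_LEVEL: Dict[str, int] = {"11": 0, "10": 1, "00": 2, "01": 3}


def build_charge_level_map(block_bits: List[List[str]]) -> List[List[int]]:
    """Convert rows (word lines) of per-cell bit strings into charge levels.

    block_bits[row][col] is assumed to hold the 2-bit value read from the
    cell at that word line (row) and bit line (column).
    """
    return [[BITS_TO_LEVEL[bits] for bits in row] for row in block_bits]


# Hypothetical 3x3 excerpt of an erase block.
erase_block_bits = [
    ["01", "00", "01"],
    ["01", "10", "01"],
    ["11", "00", "10"],
]
charge_map = build_charge_level_map(erase_block_bits)
```

Keeping the map two-dimensional lets the later pattern-matching step find a cell's neighbors with simple row and column offsets.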
[0032] At 530 the actual pattern-matching process may be done. The
earlier description of FIG. 4 provides one example of this
pattern-matching process. Each cell that is a target cell in a
matched pattern may be designated a low-confidence cell at 540. Not
all cells in the erase block need to be examined for patterns that
have that cell as the target cell. Only groups of cells in which
the target cell is in the failed page need to be examined for
pattern matching. Since pages are frequently assigned to
alternating bit lines, this may simplify the pattern matching
process. Referring to FIG. 4, the center bit line is in the failed
page, but the bit lines to the left and right are in another page
which the ECC has shown to contain correct data. So pattern
matching does not need to be applied to groups of cells in which
the target cell is not in the failed page. This also reduces the
chances that another cell in the pattern (other than the target
cell) contains incorrect data, since in this example only those
cells on the same bit line as the target cell are in the failed
page.
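The matching at 530 and 540 might be sketched as follows, using the five-cell pattern of FIG. 4 and examining only target cells on bit lines that belong to the failed page. The representation of a pattern as offsets with sets of allowed charge levels, the name find_low_confidence_cells, and the failed_columns parameter are all assumptions made for illustration. Targets whose pattern would extend past the edge of the map are skipped here; the text suggests a separate set of patterns for such cells.

```python
from typing import Dict, List, Set, Tuple

Offset = Tuple[int, int]  # (row offset, column offset) relative to the target

# The five-cell pattern of FIG. 4: target at L1, word-line neighbors at L3,
# bit-line neighbors at L2. Each entry lists the charge levels that match.
FIG4_PATTERN: Dict[Offset, Set[int]] = {
    (0, 0): {1},
    (0, -1): {3},
    (0, 1): {3},
    (-1, 0): {2},
    (1, 0): {2},
}


def find_low_confidence_cells(
    charge_map: List[List[int]],
    pattern: Dict[Offset, Set[int]],
    failed_columns: Set[int],
) -> List[Tuple[int, int]]:
    """Return (row, col) of every target cell in failed_columns that matches."""
    rows, cols = len(charge_map), len(charge_map[0])
    matches = []
    for r in range(rows):
        for c in sorted(failed_columns):
            cells = [(r + dr, c + dc, allowed) for (dr, dc), allowed in pattern.items()]
            # Skip targets whose pattern would extend past the edge of the map.
            if any(not (0 <= rr < rows and 0 <= cc < cols) for rr, cc, _ in cells):
                continue
            if all(charge_map[rr][cc] in allowed for rr, cc, allowed in cells):
                matches.append((r, c))
    return matches


charge_map = [
    [3, 2, 3],
    [3, 1, 3],
    [0, 2, 1],
]
low_confidence = find_low_confidence_cells(charge_map, FIG4_PATTERN, failed_columns={1})
```

Replacing a neighbor's set with {2, 3} would express the relaxed form of the pattern described for FIG. 4, in which an adjacent cell may hold either L2 or L3.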
Identification through Reference Voltage Proximity
[0033] This process corresponds to operations 311, 315 of FIG. 3.
When the analog value of the charge stored in a cell is very close
to the reference voltage that is used when reading that cell, it is
very easy for the voltage of the charge to end up on the wrong side
of that reference voltage and produce an incorrect reading. The
process of FIG. 6 takes advantage of this phenomenon.
[0034] FIG. 6 shows a flow diagram of a method of using proximity
to a reference voltage as a way to identify low-confidence cells in
a NV memory array, according to an embodiment of the invention. In
flow diagram 600, a process referred to herein as a `moving read
reference` (MRR) may be used to measure the analog charge voltage
in each cell. At 610, each read reference voltage may be
incremented through a range of values. This incremental process may
repeatedly cycle through operations 610-650 until the MRR process
is complete, as determined at 620. Any feasible technique may be
used to increment the reference voltage(s). For multi-level cell
technology, in which there are multiple reference voltages, some
embodiments may increment only a single reference voltage for each
new cycle, while other embodiments may increment multiple reference
voltages for each new cycle. Some embodiments may increment all the
reference voltages for each new cycle. Some embodiments may switch
between these alternatives on different cycles.
[0035] After each increment, the binary value of data stored in
each cell in the failed page may be read at 630. If the data read
from a given cell is different than it was in the previous pass, as
indicated at 640, that indicates that the incremented reference
voltage has just crossed over the analog charge value, and the
analog charge value must be very close to the reference voltage
(within the range of one increment). This value may then be
recorded at 650 in a map of analog charge values. This may be
performed for every cell that shows changed data from the previous
pass. In MLC memories that have multiple reference voltages, the
particular before-and-after binary values may need to be examined
to determine which reference voltage was crossed, so the proper
voltage may be recorded. The process of 610-650 may be repeated
until the analog charge values for all the cells in the failed page
have been recorded.
[0036] If all the incremented values of the reference voltages have
been tried, but not all the cells in the page have a recorded value
for their analog charge, then the unrecorded cells may have an
analog charge value that is outside the tested ranges (in which
case the ranges may be expanded for further testing), or the cells
may have failed completely (in which case other corrective actions, not
described here, may be taken). Assuming that all the cells have
recorded values, the process may move from operation 620 to
operation 660, where the reference voltages may be restored to
their original values so that normal read operations may take
place.
[0037] At this point, the analog charge value that was recorded for
each cell may be compared to the restored reference voltages at
670. (The restored values of the reference voltages may be
recorded, and the recorded values used in this comparison, so that
further accesses to the actual cells will not be necessary.) If any
cell has an analog charge value that is close to a restored
reference voltage, that cell may be identified at 680 as a low
confidence cell. Just how close the analog charge value needs to be
to a reference voltage before it is considered `low confidence` may
depend on various factors. In some embodiments the range of analog
charge values that are considered `close` to the reference voltage,
and/or the center point of this range, may be changed during
processing. For example, a fairly narrow range of voltages may be
used at first. If this does not produce satisfactory results, the
range may be enlarged in an attempt to include more cells in the
`low confidence` category. Several iterations of these changes may
be made before satisfactory results are obtained.
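The comparison at 670-680, together with the progressive widening of
the `close` range, may be sketched as follows. The function names,
the list of margins, and the is_satisfactory callback are hypothetical;
what counts as `satisfactory` is left to the caller, as the
description does not fix a criterion.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple


def find_low_confidence(
    charge_map: Dict[int, int], v_refs_mv: Sequence[int], margin_mv: int
) -> List[int]:
    """Return cells whose recorded charge lies within margin_mv of any reference."""
    return sorted(
        cell
        for cell, charge in charge_map.items()
        if any(abs(charge - v_ref) <= margin_mv for v_ref in v_refs_mv)
    )


def widen_until_satisfactory(
    charge_map: Dict[int, int],
    v_refs_mv: Sequence[int],
    margins_mv: Sequence[int],
    is_satisfactory: Callable[[List[int]], bool],
) -> Tuple[Optional[int], List[int]]:
    """Try each margin in turn (narrow first) until the caller accepts the result."""
    cells: List[int] = []
    for margin in margins_mv:
        cells = find_low_confidence(charge_map, v_refs_mv, margin)
        if is_satisfactory(cells):
            return margin, cells
    return None, cells  # no margin was accepted; return the widest attempt


# Hypothetical recorded charges (mV) and a single restored reference at 2000 mV.
recorded = {0: 1500, 1: 1980, 2: 2150, 3: 2600}
margin, low_conf = widen_until_satisfactory(
    recorded, [2000], [50, 200, 600], lambda cells: len(cells) >= 2
)
# margin == 200 and low_conf == [1, 2]
```

The comparison uses only the recorded map and the recorded reference
values, so no further access to the actual cells is needed.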
Correction through Charge Adjustment
[0038] This process corresponds with operations 320, 324, 328 of
FIG. 3. In general, it involves adjusting the analog charge value
of a low confidence cell in a manner that is projected to improve
the likelihood of getting correct data from the cell.
[0039] FIG. 7 shows a flow diagram of a method of correcting data
in a NV memory through adjustment of analog charge values,
according to an embodiment of the invention. To adjust analog
charge values for low-confidence cells in the failed page, the
starting values of those analog charge values must be known. If the
MRR process of operations 610-650 in FIG. 6 has already been
performed to identify low confidence cells, then those starting
values are already known. If it has not, the MRR process may be
performed before starting the method of flow diagram 700.
[0040] For each low-confidence cell, at 710 the amount (how much the
charge, and therefore the analog charge value, is to change) and the
direction (add or subtract charge) of the projected adjustment to
the cell may be determined. Any feasible method may be used to
determine the direction and amount of this adjustment. At 720, this
adjustment may be made to the analog charge value. Rather than
changing the actual charge that exists in the physical cell, this
process may be performed mathematically on the recorded analog
charge values in the analog charge map that was previously
constructed for the failed page.
[0041] The number of low confidence cells to adjust in this manner
before performing a verification may depend on numerous factors.
Adjusting all the low confidence cells (or at least a large number
of cells) runs the risk of changing previously correct data into
bad data, potentially making the problem worse. Adjusting a small
number of cells before verification runs the risk of not changing
enough bad cells to get a valid ECC result during verification, and
thus not knowing whether the changes were correct. However the number
of cells to change is chosen, once that number of cells has been
changed, as determined at 730, the process may move
to 740 where the new analog charge values for the low confidence
cells are converted into their equivalent binary data.
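Operations 710-740 may be sketched as follows for a single-level
cell. The adjustment policy shown (adding one fixed amount to every
selected cell, as might suit an assumed loss of charge over time) is
purely hypothetical, since any feasible method may be used; the names
adjust_charges and to_bits and the bit convention are likewise
assumptions.

```python
from typing import Dict, List, Sequence


def adjust_charges(
    charge_map: Dict[int, int],
    low_conf_cells: Sequence[int],
    delta_mv: int,
    max_cells: int,
) -> Dict[int, int]:
    """Apply a signed adjustment to the recorded values of selected cells.

    Only the recorded map is changed; the physical cells are not touched.
    """
    adjusted = dict(charge_map)                # leave the original map intact
    for cell in low_conf_cells[:max_cells]:    # 730: limit the number changed
        adjusted[cell] += delta_mv             # 710/720: direction is the sign
    return adjusted


def to_bits(charge_map: Dict[int, int], v_ref_mv: int) -> List[int]:
    """740: convert recorded charges to binary data against one reference.

    Assumed convention: a cell reads 1 when its charge is below the reference.
    """
    return [1 if charge_map[cell] < v_ref_mv else 0 for cell in sorted(charge_map)]


# Hypothetical recorded charges (mV); cells 1 and 2 are low confidence.
recorded = {0: 1500, 1: 1980, 2: 2150, 3: 2600}
before = to_bits(recorded, 2000)               # [1, 1, 0, 0]
adjusted = adjust_charges(recorded, [1, 2], delta_mv=50, max_cells=2)
after = to_bits(adjusted, 2000)                # [1, 0, 0, 0]
```

In this example only cell 1 changes its binary value, because its
adjusted charge crosses the reference while that of cell 2 does not.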
Correction through Random Data Substitution
[0042] This process corresponds with operation 321 of FIG. 3. In
general, it involves trying random data in the low confidence
cells, until a combination is found that produces a page with so
few errors that the errors can be corrected through other means,
such as with an ECC process. As described for the process of FIG.
7, the number of cells to change in this manner before attempting
verification may vary. In some embodiments, the preferred number of
cells to change may depend on various factors, such as but not
limited to: 1) which correction technique is being used, 2) the
number of cells contained in the failed address range, 3) etc.
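Random data substitution may be sketched as follows. The function
random_substitution and its parameters are hypothetical, and the ECC
check is passed in as a callback. The toy checker shown compares
against known-good data, which is possible only in a simulation; a
real ECC unit would use its stored check bits instead.

```python
import random
from typing import Callable, List, Optional, Sequence


def random_substitution(
    bits: Sequence[int],
    low_conf_cells: Sequence[int],
    ecc_can_correct: Callable[[List[int]], bool],
    max_tries: int,
    rng: random.Random,
) -> Optional[List[int]]:
    """Try random values in the low confidence cells until the ECC check passes."""
    for _ in range(max_tries):
        candidate = list(bits)
        for cell in low_conf_cells:
            candidate[cell] = rng.randint(0, 1)   # random bit for this cell
        if ecc_can_correct(candidate):
            return candidate                      # few enough errors remain
    return None                                   # no acceptable combination found


# Simulation only: the "true" data is known here so a toy check can be written.
TRUE_BITS = [1, 0, 1, 1, 0, 0, 1, 0]


def toy_ecc_can_correct(bits: Sequence[int], t: int = 1) -> bool:
    """Stand-in for an ECC able to correct up to t bit errors."""
    return sum(a != b for a, b in zip(bits, TRUE_BITS)) <= t


read_back = [1, 1, 1, 1, 1, 0, 0, 0]              # errors in cells 1, 4 and 6
candidate = random_substitution(
    read_back, [1, 4, 6], toy_ecc_can_correct, 50, random.Random(0)
)
```

Only the low confidence cells are altered, so every other cell in the
candidate page keeps the value that was originally read.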
Verification
[0043] Verification corresponds with operation 330 of FIG. 3.
Verification of the data in the page may be performed through any
feasible manner, such as but not limited to processing the page of
binary data through an ECC process that will correct the remaining
errors, provided the quantity of errors is not beyond the ability
of the ECC to correct. If the corrected data is verified in this
manner, the corrected data may be written to a storage location
where it can be used. This would presumably not be to the same page
in the same erase block in which it originally failed, since
defects in that physical storage location may have caused the
errors in the first place. On the other hand, if the new page of
data still has uncorrectable errors, various procedures may be
followed, such as but not limited to one or more of: 1) assume the
data in the page is lost, and discard the data, 2) re-run the
previous operations on the new or original data, but use different
parameters to correct the data, 3) re-run the previous operations
on different low confidence cells, 4) run different detection
operations to determine the low confidence cells, 5) etc.
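The verify-and-retry flow described above may be sketched as follows.
Each entry in attempts is a hypothetical callable that runs one
correction pass (for example, with a different parameter set or a
different set of low confidence cells) and returns a candidate page
or None; ecc_can_correct is again an assumed callback standing in for
the ECC process.

```python
from typing import Callable, Iterable, List, Optional


def recover_page(
    attempts: Iterable[Callable[[], Optional[List[int]]]],
    ecc_can_correct: Callable[[List[int]], bool],
) -> Optional[List[int]]:
    """Run correction attempts in order and return the first page that verifies."""
    for attempt in attempts:
        candidate = attempt()
        if candidate is not None and ecc_can_correct(candidate):
            return candidate      # verified: may be written to a new location
    return None                   # all attempts failed: data treated as lost
```

A caller that receives a verified page would then write it to a
different physical location, and a result of None corresponds to
assuming the data in the page is lost.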
[0044] In some embodiments these operations may be performed within
the controller of the NV memory (e.g., by the error analysis unit
144 of FIG. 1). In other embodiments these operations may be
performed external to the NV memory (e.g., by one or more main
processors in the computer system). In still other embodiments
these operations may be performed by multiple such devices (e.g.,
some operations in the NV memory controller, some operations by the
main processor(s)). These are only a few of the ways in which the
operations may be performed.
[0045] The foregoing description is intended to be illustrative and
not limiting. Variations will occur to those of skill in the art.
Those variations are intended to be included in the various
embodiments of the invention, which are limited only by the scope
of the following claims.
* * * * *