U.S. patent application number 16/887804 was filed with the patent office on 2020-05-29 and published on 2020-12-03 for method and system for repairing Reed-Solomon codes.
This patent application is currently assigned to The Regents of the University of California. The applicant listed for this patent is The Regents of the University of California. Invention is credited to Hamid JAFARKHANI, Weiqi LI, Zhiying WANG.
Application Number | 20200382141 16/887804 |
Document ID | / |
Family ID | 1000005006415 |
Filed Date | 2020-05-29 |
United States Patent
Application |
20200382141 |
Kind Code |
A1 |
WANG; Zhiying ; et
al. |
December 3, 2020 |
Method and System for Repairing Reed-Solomon Codes
Abstract
Methods and devices are provided for error correction of
distributed data in distributed systems using Reed-Solomon codes.
In one embodiment, processes are provided for error correction that
include receiving a first correction code for data fragments stored
in storage nodes, constructing a second correction code responsive
to an unavailable storage node of the storage nodes, performing
erasure repair of the unavailable storage node, and outputting a
corrected data fragment. The first correction code is a
Reed-Solomon code represented as a polynomial and the second
correction code is represented as a second polynomial with an
increased subpacketization size. Processes are configured to
account for repair bandwidth and sub-packetization size. Code
constructions and repair schemes accommodate different sizes of
evaluation points and provide a flexible tradeoff between the
subpacketization size and repair bandwidth of codes. In addition,
schemes are provided to manage a single node failure and multiple
node failures.
Inventors: |
WANG; Zhiying; (Irvine,
CA) ; LI; Weiqi; (Irvine, CA) ; JAFARKHANI;
Hamid; (Irvine, CA) |
|
Applicant: |
Name | City | State | Country | Type |
The Regents of the University of California | Oakland | CA | US | |
Assignee: |
The Regents of the University of California | Oakland | CA |
Family ID: |
1000005006415 |
Appl. No.: |
16/887804 |
Filed: |
May 29, 2020 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
62855361 | May 31, 2019 | |
Current U.S.
Class: |
1/1 |
Current CPC
Class: |
H03M 13/154 20130101;
H03M 13/1515 20130101; H03M 13/2906 20130101; G06F 11/1076
20130101 |
International
Class: |
H03M 13/29 20060101
H03M013/29; G06F 11/10 20060101 G06F011/10; H03M 13/15 20060101
H03M013/15 |
Claims
1. A method for error correction of distributed data including
modifying a Reed-Solomon correction code, the method comprising:
receiving, by a device, a first correction code for a plurality of
data fragments stored in a plurality of storage nodes, wherein the
first correction code is a Reed-Solomon code having a data symbol
for each of the plurality of storage nodes and wherein the first
correction code is represented as a first polynomial over a first
finite field having a first subpacketization size; constructing, by
the device, a second correction code in response to at least one
unavailable storage node of the plurality of storage nodes, wherein
the second correction code is represented as a second polynomial
over a second finite field, the second polynomial having an
increased subpacketization size relative to the first polynomial;
performing, by the device, erasure repair for the at least one
unavailable storage node using the second correction code, wherein
the second correction code is applied to available data fragments
of the plurality of data fragments for erasure repair, wherein the
erasure repair utilizes at least one coset of a multiplicative
group of the second finite field; and outputting, by the device, a
corrected data fragment based on the erasure repair.
2. The method of claim 1, wherein the second correction code is
constructed for single erasure repair and erasure repair utilizes
at least one of one coset and two cosets, wherein code length (n)
and dimension (k) of the first correction code are maintained.
3. The method of claim 1, wherein the second correction code is
constructed for single erasure repair and erasure repair utilizes
multiple cosets, the code length (n) of the second correction code
is increased and redundancy (r) is fixed.
4. The method of claim 1, wherein the second correction code is
constructed for single erasure repair, the second correction code
providing a scalar code with evaluation points selected from one
coset.
5. The method of claim 1, wherein the second correction code is
constructed for single erasure repair with evaluation points in
selected two cosets, and erasure repair includes selection of
correction code polynomials that have a full rank condition in a
coset of an unavailable node and rank 1 when evaluated at another
coset.
6. The method of claim 1, wherein the second correction code is
constructed for single erasure repair, wherein evaluation points
for erasure repair are chosen from multiple cosets to increase code
length.
7. The method of claim 1, wherein the second correction code is
constructed for repairing multiple erasures in the plurality of
data fragments, wherein evaluation points for erasure repair are in
at least one of one coset and multiple cosets.
8. The method of claim 1, wherein the second correction code is
constructed for repairing multiple erasures in the plurality of
data fragments, wherein erasure repair includes use of a helper
node to reconstruct at least one symbol for a first data fragment,
and wherein the reconstructed symbol is used for erasure repair of
a second data fragment.
9. The method of claim 1, wherein the performing of the erasure
repair includes using a first dual codeword and a second dual
codeword as repair polynomials, and combining a trace function with
dual codewords to generate fragments.
10. The method of claim 1, further comprising determining an
erasure repair scheme for the plurality of data fragments.
11. A device configured for error correction of distributed data
including modifying a Reed-Solomon correction code, the device
comprising: an interface configured to receive a first correction
code for a plurality of data fragments stored in a plurality of
storage nodes, wherein the correction code is a Reed-Solomon code
having a data symbol for each of the plurality of storage nodes and
wherein the first correction code is represented as a first
polynomial over a first finite field having a first
subpacketization size; a repair module, coupled to the interface,
wherein the repair module is configured to: construct a second
correction code in response to at least one unavailable storage
node of the plurality of storage nodes, wherein the second
correction code is represented as a second polynomial over a second
finite field, the second polynomial having an increased
subpacketization size relative to the first polynomial; perform
erasure repair for the at least one unavailable storage node using
the second correction code, wherein the second correction code is
applied to available data fragments of the plurality of data
fragments for erasure repair, wherein the erasure repair utilizes
at least one coset of a multiplicative group of the second finite
field; and output a corrected data fragment based on the erasure
repair.
12. The device of claim 11, wherein the second correction code is
constructed for single erasure repair and erasure repair utilizes
at least one of one coset and two cosets, wherein code length (n)
and dimension (k) of the first correction code are maintained.
13. The device of claim 11, wherein the second correction code is
constructed for single erasure repair and erasure repair utilizes
multiple cosets, the code length (n) of the second correction code
is increased and redundancy (r) is fixed.
14. The device of claim 11, wherein the second correction code is
constructed for single erasure repair, the second correction code
providing a scalar code with evaluation points selected from one
coset.
15. The device of claim 11, wherein the second correction code is
constructed for single erasure repair with evaluation points in
selected two cosets, and erasure repair includes selection of
correction code polynomials that have a full rank condition in a
coset of an unavailable node and rank 1 when evaluated at another
coset.
16. The device of claim 11, wherein the second correction code is
constructed for single erasure repair, wherein evaluation points
for erasure repair are chosen from multiple cosets to increase code
length.
17. The device of claim 11, wherein the second correction code is
constructed for repairing multiple erasures in the plurality of
data fragments, wherein evaluation points for erasure repair are in
at least one of one coset and multiple cosets.
18. The device of claim 11, wherein the second correction code is
constructed for repairing multiple erasures in the plurality of
data fragments, wherein erasure repair includes use of a helper
node to reconstruct at least one symbol for a first data fragment,
and wherein the reconstructed symbol is used for erasure repair of
a second data fragment.
19. The device of claim 11, wherein the performing of the erasure
repair includes using a first dual codeword and a second dual
codeword as repair polynomials, and combining a trace function with
dual codewords to generate fragments.
20. The device of claim 11, further comprising determining an
erasure repair scheme for the plurality of data fragments.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional
Application No. 62/855,361 titled METHOD AND SYSTEM FOR REPAIRING
REED-SOLOMON CODES filed on May 31, 2019, the content of which is
expressly incorporated by reference in its entirety.
FIELD
[0002] The present disclosure generally relates to methods for
determining error correcting codes for storage systems, distributed
storage, and repairing erasures.
BACKGROUND
[0003] Reed-Solomon (RS) codes are very popular in distributed
systems because they provide efficient implementation and high
failure-correction capability for a given redundancy level. RS
codes have been used in systems such as Google's Colossus, Quantcast
File System, Facebook's f4, Yahoo Object Store, Baidu's Atlas,
Backblaze's Vaults, and Hadoop Distributed File System. However, the
transmission traffic required to repair failures is a major problem for the
application of RS codes.
[0004] Among existing repair processes for RS codes, Shanmugam et
al. considered the repair of scalar codes for the first time.
Guruswami and Wootters (2017) proposed a repair scheme for RS
codes. For an RS code with length n and dimension k over the field
F, it achieves the repair bandwidth of n-1 symbols over GF(q). Dau
and Milenkovic (2017) improved the scheme using full-length codes.
Ye and Barg (2016) proposed a scheme that asymptotically approaches
the MSR (minimum storage regenerating) bandwidth lower bound of
l(n-1)/(n-k) where the sub-packetization size is l=(n-k).sup.n. Tamo et
al. (2017) provided an RS code repair scheme achieving the MSR
bound, where the sub-packetization size is approximately n.sup.n.
These two schemes are called MSR schemes.
[0005] The repair problem for RS codes can also be generalized to
multiple erasures. In this case, the schemes in Dau et al. (2018)
and Mardia et al. (2018) work for the full-length code, and Ye et
al. (2017) proposed a scheme achieving the multiple-erasure MSR
bound. The full-length RS code has high repair bandwidth, and the
MSR-achieving RS code has large sub-packetization.
[0006] A flexible tradeoff between the sub-packetization size and
the repair bandwidth is an open problem of previous schemes. Only
the full-length RS code with high repair bandwidth and the
MSR-achieving RS code with large sub-packetization are established.
There is a need for providing more points between the two extremes:
the full-length code and the MSR code. One straightforward method
is to apply the schemes to the case of l>log.sub.q n with fixed
(n, k). However, the resulting normalized repair bandwidth grows
with l, contradictory to the idea that a larger l implies smaller
normalized bandwidth.
[0007] There is a desire for improvements in repair bandwidths and
processes and configurations for RS code repair.
BRIEF SUMMARY OF THE EMBODIMENTS
[0008] Disclosed and claimed herein are systems and methods for
efficient repair of Reed-Solomon codes. In one embodiment, a method
for error correction of distributed data includes receiving, by a
device, a first correction code for a plurality of data fragments.
These fragments are stored in a plurality of storage nodes. The
first correction code is a Reed-Solomon code having a data symbol
for each of the plurality of storage nodes. Moreover, this correction
code is represented as a first polynomial over a first finite field
and has a first subpacketization size. The method also includes
constructing, by the device, a second correction code in response
to at least one unavailable storage node of the plurality of
storage nodes. The second correction code is represented as a
second polynomial over a second finite field and has an increased
subpacketization size relative to the first polynomial. The method
includes performing, by the device, erasure repair for the at least
one unavailable storage node using the second correction code. The
second correction code is applied to available data fragments of
the plurality of data fragments for erasure repair. In one
embodiment, the erasure repair utilizes at least one coset of a
multiplicative group of the second finite field. The method also
includes outputting, by the device, a corrected data fragment based
on the erasure repair.
[0009] In one embodiment, the second correction code is constructed
for single erasure repair. The erasure repair utilizes at least
one of one coset and two cosets. In this embodiment, the code
length (n) and dimension (k) of the first correction code are
maintained.
[0010] In one embodiment, the second correction code is constructed
for single erasure repair. The erasure repair utilizes multiple
cosets. In this embodiment, the code length (n) of the second
correction code is increased and redundancy (r) is fixed.
[0011] In one embodiment, the second correction code is constructed
for single erasure repair. In this embodiment, the second
correction code provides a scalar code with evaluation points
selected from one coset.
[0012] In one embodiment, the second correction code is constructed
for single erasure repair with evaluation points in selected two
cosets. In this embodiment, the erasure repair includes selection
of correction code polynomials that have a full rank condition in a
coset of an unavailable node and rank 1 when evaluated at another
coset.
[0013] In one embodiment, the second correction code is constructed
for single erasure repair. In this embodiment, the evaluation
points for erasure repair are chosen from multiple cosets to
increase code length.
[0014] In one embodiment, the second correction code is constructed
for repairing multiple erasures in the plurality of data fragments.
In this embodiment, the erasure repair includes use of a helper
node to reconstruct at least one symbol for a first data fragment.
Moreover, in this embodiment, the reconstructed symbol is used for
erasure repair of a second data fragment.
[0015] In one embodiment, the erasure repair includes using a first
dual codeword and a second dual codeword as repair polynomials.
Additionally, this embodiment includes combining a trace function
with dual codewords in order to generate fragments.
[0016] In one embodiment, the method for error correction of
distributed data as described above further includes determining an
erasure repair scheme for the plurality of data fragments.
[0017] Another embodiment is directed to a device configured for
error correction of distributed data that includes modifying a
Reed-Solomon correction code. The device includes an interface
configured to receive a first correction code for a plurality of
data fragments stored in a plurality of storage nodes. In this
embodiment, the correction code is a Reed-Solomon code having a
data symbol for each of the plurality of storage nodes. Moreover, the
first correction code is represented as a first polynomial over a
first finite field and has a first subpacketization size. The
device further includes a repair module, coupled to the interface.
The repair module is configured to construct a second correction
code in response to at least one unavailable storage node of the
plurality of storage nodes. In this embodiment, the second
correction code is a second polynomial representation over a second
finite field. The second polynomial representation has an increased
subpacketization size relative to the first polynomial. The repair
module is further configured to perform erasure repair for the at
least one unavailable storage node using the second correction
code. The second correction code is applied to available data
fragments of the plurality of data fragments for erasure repair.
The erasure repair utilizes at least one coset of a multiplicative
group of the second finite field. In this embodiment, the repair
module is further configured to output a corrected data fragment
based on the erasure repair.
BRIEF DESCRIPTION OF THE DRAWINGS
[0018] The features, objects, and advantages of the present
disclosure will become more apparent from the detailed description
set forth below when taken in conjunction with the drawings in
which like reference characters identify correspondingly throughout
and wherein:
[0019] FIG. 1 illustrates an exemplary network architecture for
performing error correction of distributed data according to one or
more embodiments;
[0020] FIG. 2 depicts a process for error correction of distributed
data according to one or more embodiments;
[0021] FIG. 3 depicts a diagram of a computing device that may be
configured for error correction of distributed data according to
one or more embodiments;
[0022] FIG. 4 illustrates a graphical representation showing
normalized repair bandwidth values that correspond to various
subpacketization sizes as a result of implementing different repair
schemes according to one or more embodiments;
[0023] FIG. 5 illustrates a graph representation showing normalized
repair bandwidth values that correspond to various subpacketization
sizes as a result of implementing different repair schemes
according to one or more embodiments;
[0024] FIG. 6 shows erasure locations in various data packets
according to one or more embodiments;
[0025] FIG. 7 illustrates a graph representation showing normalized
repair bandwidth values that correspond to various subpacketization
sizes as a result of implementing different repair schemes in
accordance with one or more embodiments; and
[0026] FIG. 8 illustrates a process for error correction of
distributed data according to one or more embodiments that includes
modifying a Reed-Solomon correction code.
DETAILED DESCRIPTION OF THE EXEMPLARY EMBODIMENTS
Overview and Terminology
[0027] One aspect of the disclosure is directed to determining
error correcting codes for storage systems, distributed storage,
and repairing erasure.
[0028] In distributed storage, storage nodes may be unavailable for
data access due to node failure or temporary high workload. To
alleviate this issue, erasure correcting codes are applied to a
plurality of storage nodes for a plurality of data fragments. Each
storage node may correspond to one symbol of the erasure correcting
code (e.g., codeword). Reed-Solomon (RS) codes are one of the most
widely used codes. In particular, for $k$ information nodes, $n-k$
redundant parity nodes are added, and RS codes can tolerate $n-k$
unavailable nodes. However, when node unavailability occurs, a lot
of communication traffic is incurred in order to repair the
information. Schemes that require traffic equal to the content
of $k$ storage nodes are prohibitive in large-scale storage
systems. Embodiments herein provide methods/algorithms to implement
the repair of the RS codes that require less transmission (i.e.,
less repair bandwidth) compared to existing repair schemes. Moreover,
methods described here achieve low complexity for a given bandwidth
since sub-packetization size is small, and account for bandwidth
considerations.
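The naive repair mentioned above can be sketched as follows. This is a minimal illustration, not the disclosed scheme: it assumes a prime field GF(257) for brevity (the disclosure works over GF(q.sup.l)), and the names `encode` and `interpolate_at`, the evaluation points, and the message values are hypothetical. It shows an RS(14, 10) code viewed as polynomial evaluations, where a lost symbol is rebuilt by downloading k full symbols from helper nodes.

```python
P = 257  # assumption: a small prime field keeps the arithmetic short


def encode(message, points):
    """Evaluate the polynomial with coefficients `message` at each point (Horner)."""
    out = []
    for x in points:
        acc = 0
        for c in reversed(message):
            acc = (acc * x + c) % P
        out.append(acc)
    return out


def interpolate_at(xs, ys, x0):
    """Lagrange-interpolate the value at x0 from the k pairs (xs[i], ys[i]) mod P."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (x0 - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


n, k = 14, 10
points = list(range(1, n + 1))             # one evaluation point per storage node
message = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]   # k information symbols (hypothetical)
codeword = encode(message, points)

# Naive repair of one node: download k whole symbols from any k helpers.
failed = 5
helpers = [i for i in range(n) if i != failed][:k]
repaired = interpolate_at([points[i] for i in helpers],
                          [codeword[i] for i in helpers],
                          points[failed])
```

Here the repair traffic equals the content of k = 10 nodes, which is the cost the repair schemes in this disclosure aim to reduce.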
[0029] In distributed storage, every code word symbol corresponds
to a storage node, and communication costs between storage nodes
need to be considered when node failures are repaired. The repair
bandwidth is defined as the amount of transmission required to
repair a single node erasure, or failure, from the remaining nodes
(called helper nodes). For an RS code over the finite field
F=GF(q.sup.l), the sub-packetization size of the code is l. Here q
is a power of a prime number, and the finite field may relate to a Galois
field. As used herein, erasure may include one or more of lost,
corrupt and unavailable code fragments. Small repair bandwidth is
desirable to reduce the network traffic in distributed storage.
Small sub-packetization is attractive due to the complexity in
field arithmetic operations. Embodiments discussed herein provide
constructions and repair algorithms of RS codes to provide a
flexible tradeoff between the repair bandwidth and the
sub-packetization size.
[0030] Erasure codes are ubiquitous in distributed storage systems
because they can efficiently store data while protecting against
failures. Reed-Solomon (RS) code is one of the most commonly used
codes because it achieves the Singleton bound and has efficient
encoding and decoding methods. Codes matching the Singleton bound
are called maximum distance separable (MDS) codes, and they have
the highest possible failure-correction capability for a given
redundancy level. In distributed storage, every code word symbol
corresponds to a storage node, and communication costs between
storage nodes need to be considered when node failures are
repaired. The repair bandwidth of RS codes may be defined as the
amount of transmission required to repair a single node erasure, or
failure, from all the remaining nodes (called helper nodes). For a
given erasure code, when each node corresponds to a single finite
field symbol over F=GF(q.sup.l), the code is said to be scalar;
when each node is a vector of finite field symbols in B=GF(q) of
length l, it is called a vector code or an array code. In both
cases, we say the sub-packetization size of the code is l. Here q
is a power of a prime number. Prior work considered the repair of
scalar codes for the first time, and a more recent repair scheme for RS
codes rests on the following key idea: rather than directly using the
helper nodes as symbols over F to repair the failed node, one
treats them as vectors over the subfield B. Thus, a helper may
transmit fewer than l symbols over B, resulting in a reduced
bandwidth. For an RS code with length n and dimension k over the
field F, denoted by RS(n, k), a prior approach achieves a repair
bandwidth of n-1 symbols over B. Moreover, when n=q.sup.l (called
the full-length RS code) and n-k=q.sup.l-1, the scheme provides the
optimal repair bandwidth. An improved scheme makes the repair
bandwidth optimal for the full-length RS code and any
n-k=q.sup.s, 1.ltoreq.s.ltoreq.log(n-k). In one embodiment, the helper
nodes are treated as vectors over a subfield B, in that each helper
stores an element x.di-elect cons.F. Any element x.di-elect cons.F
can be represented as x=x.sub.0.beta..sub.0+x.sub.1.beta..sub.1+ . .
. +x.sub.l-1.beta..sub.l-1 with x.sub.0, x.sub.1, . . . , x.sub.l-1
.di-elect cons.B and a set of basis {.beta..sub.0,.beta..sub.1, . .
. , .beta..sub.l-1} for F over B. For example, x.di-elect
cons.F=GF(2.sup.2)={0,1, .alpha.,.alpha..sup.2} can be represented as
x=x.sub.0.beta..sub.0+x.sub.1.beta..sub.1 with x.sub.0, x.sub.1
.di-elect cons.B=GF(2)={0,1} and the basis {.beta..sub.0=1,
.beta..sub.1=.alpha.}. Here F is extended from B by .alpha., which
satisfies .alpha..sup.2=1+.alpha., i.e., .alpha. is a root of the
polynomial x.sup.2+x+1. Elements included in the
subfield B may be the elements in the Galois field B=GF(q). Here
are some examples. If q=2, the subfield is B=GF(2)={0,1}. If q is a
prime number, the subfield is B=GF(q)={0,1, . . . , q-1}. If q=4,
the subfield is B=GF(2.sup.2)={0,1, .alpha.,.alpha..sup.2}. The
relationship between the subfield B and the finite field F may be
that F is the Galois field extended from B by a monic irreducible
polynomial of degree l. All elements of B are contained in F. See
the previous example of F=GF(2.sup.2)={0,1, .alpha.,.alpha..sup.2}
and B=GF(2)={0,1}.
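The subfield idea above can be illustrated with the smallest case, F=GF(2.sup.2) over B=GF(2). The sketch below is an assumption-laden illustration of the prior trace-based approach for a full-length RS(4, 2) code (n=q.sup.l=4, n-k=q.sup.l-1=2), not the scheme of this disclosure. The bit encoding x = x0 + x1*alpha, the function names, and the brute-force recovery step are all illustrative choices. Each helper sends a single bit, so the failed symbol is rebuilt from n-1 = 3 bits instead of the k*l = 4 bits of naive repair.

```python
ALPHA = 2            # alpha stored as bit pattern 0b10: x = x0 + x1*alpha -> x0 | (x1 << 1)
POINTS = [0, 1, 2, 3]  # full-length code: every element of GF(4) is an evaluation point
BASIS = [1, ALPHA]     # basis {beta_0 = 1, beta_1 = alpha} of F over B


def gf4_mul(a, b):
    """Multiply in GF(4) = GF(2)[x]/(x^2 + x + 1)."""
    r = 0
    if b & 1:
        r ^= a
    if b & 2:
        r ^= a << 1
    if r & 4:
        r ^= 0b111
    return r


def gf4_inv(a):
    """Inverse of a nonzero element: a^3 = 1, so a^-1 = a^2."""
    return gf4_mul(a, a)


def trace(x):
    """Trace from GF(4) to GF(2): tr(x) = x + x^2, always 0 or 1."""
    return x ^ gf4_mul(x, x)


def encode(f0, f1):
    """Codeword of RS(4, 2): evaluations of f(x) = f0 + f1*x at all points."""
    return [f0 ^ gf4_mul(f1, a) for a in POINTS]


def repair(codeword, failed):
    """Rebuild codeword[failed] from one bit per helper; returns (symbol, bits used)."""
    a_star = POINTS[failed]
    # Helper i sends the single bit tr(c_i / (alpha_i - alpha*)).
    bits = {i: trace(gf4_mul(codeword[i], gf4_inv(POINTS[i] ^ a_star)))
            for i in range(len(POINTS)) if i != failed}
    # tr(beta * c*) = sum_i tr(beta * (alpha_i - alpha*)) * bit_i for each basis element.
    targets = []
    for beta in BASIS:
        t = 0
        for i, bit in bits.items():
            t ^= trace(gf4_mul(beta, POINTS[i] ^ a_star)) & bit
        targets.append(t)
    # The two traces determine the symbol uniquely; search the four candidates.
    for y in range(4):
        if [trace(gf4_mul(beta, y)) for beta in BASIS] == targets:
            return y, len(bits)
```

For example, `repair(encode(1, 0), 0)` rebuilds the first symbol of the codeword for f(x) = 1 while reading only three bits in total.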
[0031] For the full-length RS code, some schemes are optimal for
single erasure. However, the repair bandwidth of these schemes
still has a big gap from the minimum storage regenerating (MSR)
bound. In particular, for an arbitrary MDS code, the repair
bandwidth b, measured in the number of symbols over GF(q), is lower
bounded by
$$b \geq \frac{l(n-1)}{n-k}. \qquad (1)$$
[0032] An MDS code satisfying the above bound is called an MSR
code. In fact, most known MSR codes are vector codes. For the
repair of RS codes, schemes have been proposed that asymptotically approach
the MSR bound as n grows when the sub-packetization size is l=(n-k).sup.n.
Another technique provided an RS code repair scheme achieving
the MSR bound when the sub-packetization size is l.apprxeq.n.sup.n.
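As a rough worked instance of the bound in (1), consider the RS(14, 10) code with l=8 and q=2 that this description uses as an example elsewhere. Picking these values at this point is an illustrative assumption, and the naive figure assumes that k whole symbols are downloaded:

```latex
b \;\geq\; \frac{l(n-1)}{n-k} \;=\; \frac{8 \cdot 13}{4} \;=\; 26 \text{ bits},
\qquad
\text{naive repair: } k\,l \;=\; 10 \cdot 8 \;=\; 80 \text{ bits}.
```

Any repair scheme for this code therefore falls somewhere between 26 and 80 bits of repair traffic.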
[0033] The repair problem for RS codes can also be generalized to
multiple erasures. In this case, the schemes work for the
full-length code and for centralized repair. According to one
embodiment, a scheme is provided for achieving the multiple-erasure
MSR bound.
[0034] The need for small repair bandwidth is motivated by reducing
the network traffic in distributed storage, and the need for the
small sub-packetization is due to the complexity in field
arithmetic operations. It has been demonstrated that the time complexity
of multiplications in larger fields is much higher than that of
smaller fields. Moreover, multiplications in Galois fields are
usually done by pre-computed look-up tables and the growing field
size has a significant impact on the space complexity of
multiplication operations. Larger fields require huge memories for
the look-up table. For example, in GF (2.sup.16), 8 GB are required
for the complete table, which is impractical in most current
systems. Some logarithm tables and sub-tables are used to alleviate
the memory problems for large fields, while increasing the time
complexity at the same time. For example, in the Intel SIMD
methods, multiplications over GF (2.sup.16) need twice the amount
of operations as over GF (2.sup.8), and multiplications over GF
(2.sup.32) need 4 times the amount of operations compared to GF
(2.sup.8), which causes the multiplication speed to drop
significantly when the field size grows.
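The table sizes quoted above can be checked with a short sketch. It assumes a complete product table with 2.sup.m x 2.sup.m entries stored in whole bytes, and, for the logarithm-table alternative, the commonly used GF(2.sup.8) polynomial x.sup.8+x.sup.4+x.sup.3+x.sup.2+1 (0x11D). Both the storage layout and the polynomial are assumptions made for illustration; the disclosure does not fix either.

```python
def full_table_bytes(m):
    """Bytes for a complete GF(2^m) product table: 2^m * 2^m entries of ceil(m/8) bytes."""
    return (1 << (2 * m)) * ((m + 7) // 8)


def build_log_tables(m=8, poly=0x11D):
    """Build exp/log tables for GF(2^m) using x as generator (poly is an assumed choice)."""
    size = (1 << m) - 1
    exp = [0] * (2 * size)   # doubled so that log sums need no modulo reduction
    log = [0] * (size + 1)
    x = 1
    for i in range(size):
        exp[i] = x
        log[x] = i
        x <<= 1
        if x >> m:
            x ^= poly
    for i in range(size, 2 * size):
        exp[i] = exp[i - size]
    return exp, log


EXP, LOG = build_log_tables()


def mul_log(a, b):
    """Multiply via log/antilog tables: two lookups, one add, one lookup."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def mul_bitwise(a, b, m=8, poly=0x11D):
    """Reference shift-and-reduce multiplication, used here only as a cross-check."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> m:
            a ^= poly
    return r
```

Under these assumptions, `full_table_bytes(8)` is 64 KiB while `full_table_bytes(16)` is 8 GiB, which matches the figure in the text. The log tables trade that memory for extra lookups per multiplication.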
[0035] To illustrate the impact of the sub-packetization size on
the complexity, an encoding example is provided. To encode a single
parity check node, k multiplications and k additions need to be
performed over GF(q.sup.l). For a given systematic RS (n, k) code
over GF(q.sup.l), we can encode kl log.sub.2 q bits of information
by multiplications of (n-k)kl log.sub.2 q bits and additions of
(n-k)kl log.sub.2 q bits. When M bits are encoded into RS (n, k) codes,
we need M/(kl log.sub.2 q) copies of the code and we need
multiplications of M(n-k) bits and additions of M(n-k) bits in
GF(q.sup.l) in total. Although the total number of bits we need to
multiply is independent of l, the complexity over a larger field is
higher in both time and space. In simulations of the RS code
speed using different field sizes on different platforms, RS codes
may have faster implementations in both encoding and decoding for
smaller fields.
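The total in the encoding example follows from multiplying the number of codewords by the per-codeword cost. Written out, using only the quantities defined in the paragraph above:

```latex
\underbrace{\frac{M}{k\,l\log_2 q}}_{\text{number of codewords}}
\times
\underbrace{(n-k)\,k\,l\log_2 q}_{\text{bits multiplied per codeword}}
\;=\; M\,(n-k).
```

The factor l cancels, which is why the bit count does not depend on the sub-packetization size even though the cost of each field operation does.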
[0036] Besides complexity, the small sub-packetization level also
has many advantages such as easy system implementation, great
flexibility and bandwidth-efficient access to missing small files,
which makes it important in distributed storage applications.
[0037] As can be seen from the two extremes, a small
sub-packetization level also means higher costs in repair
bandwidth, and not many other codes are known besides the extremes.
For vector codes, existing approaches either provide small
sub-packetization codes with small repair bandwidth, but only for
single erasure, or present a tradeoff between the
sub-packetization level and the repair bandwidth, as in the
HashTag codes implemented in Hadoop. For scalar codes, an MSR
scheme has been extended to a smaller sub-packetization size, but
it only works for certain redundancy r and single erasure.
[0038] Embodiments herein are directed to a design for three
single-erasure RS repair schemes, using the cosets of the
multiplicative group of the finite field F. Note that the RS code
can be viewed as n evaluations of a polynomial over F. The
evaluation points of the three schemes are part of one coset, of
two cosets, and of multiple cosets, respectively, so that the
evaluation point size can vary from a very small number to the
whole field size. In the schemes designed herein, a tunable parameter a
is provided, which gives a tradeoff between the
sub-packetization size and the repair bandwidth.
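To make the notion of cosets concrete, the following sketch lists the cosets of a subgroup of the multiplicative group of F=GF(2.sup.4). It assumes that the relevant subgroup is the set of nonzero elements of the subfield GF(q.sup.a) with q=2 and a=2, which is an interpretation suggested by the q.sup.a-1 terms in the formulas. The field polynomial x.sup.4+x+1, the generator, and all names are illustrative choices.

```python
def gf16_mul(a, b):
    """Multiply in GF(2^4) = GF(2)[x]/(x^4 + x + 1); elements are 4-bit integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= 0b10011
    return r


def powers(g, count):
    """Return [g^0, g^1, ..., g^(count-1)]."""
    out, x = [], 1
    for _ in range(count):
        out.append(x)
        x = gf16_mul(x, g)
    return out


G = 2                      # x is a primitive element for x^4 + x + 1
NONZERO = powers(G, 15)    # the multiplicative group of GF(16), order 15

# Subgroup of order q^a - 1 = 3: the nonzero elements of the subfield GF(4).
SUBGROUP = [NONZERO[i] for i in range(0, 15, 5)]

# The 15 / 3 = 5 cosets g^j * SUBGROUP partition the nonzero field elements.
COSETS = [sorted(gf16_mul(NONZERO[j], h) for h in SUBGROUP) for j in range(5)]
```

Evaluation points drawn from part of one coset, from two cosets, or from several cosets then give code lengths ranging from at most 3 up to 15 in this toy field.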
[0039] According to one embodiment, a first scheme is provided for
an RS (n, k) code that achieves the repair bandwidth
$$\frac{l}{a}(n-1)(a-s)$$
for some a, s such that n<q.sup.a, r=n-k>q.sup.s and a divides l.
Specifically, for the RS (14, 10) code, we achieve repair bandwidth
of 52 bits with l=8, which is 35% better than the naive repair
scheme.
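The 52-bit figure can be reproduced with a few lines of arithmetic. The sketch assumes that the first-scheme bandwidth is (l/a)(n-1)(a-s) symbols over GF(q) with q=2, that naive repair downloads k symbols of l bits each, and that a=4 and s=2. The value of s is inferred from the quoted 52 bits rather than stated in the text, and the function names are hypothetical.

```python
def naive_repair_bits(k, l):
    """Naive repair over GF(2^l): download k whole symbols of l bits each."""
    return k * l


def first_scheme_bits(n, l, a, s):
    """Bandwidth (l/a)(n-1)(a-s) in bits for q = 2; requires that a divides l."""
    if l % a != 0:
        raise ValueError("a must divide l")
    return (l // a) * (n - 1) * (a - s)


n, k, l = 14, 10, 8
a, s = 4, 2   # assumption: s = 2 is inferred from the 52-bit figure in the text

scheme = first_scheme_bits(n, l, a, s)
naive = naive_repair_bits(k, l)
saving_percent = 100 * (naive - scheme) / naive
```

With these values the scheme needs 52 bits against 80 bits for naive repair, a saving of 35%.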
[0040] According to another embodiment, a second scheme reaches the
repair bandwidth of
$$(n-1)\frac{l+a}{2}$$
for some a such that n.ltoreq.2(q.sup.a-1), a divides l and
$$r \geq \frac{l}{a}.$$
[0041] According to another embodiment, a third scheme attains the
repair bandwidth of
$$\frac{l}{r}\left(n+1+(r-1)(q^{a}-2)\right) \quad \text{when} \quad n \leq (q^{a}-1)\log_{r}\frac{l}{a}.$$
Another realization of the third scheme attains the repair
bandwidth of
$$\frac{l}{r}\left(n-1+(r-1)(q^{a}-2)\right)$$
where
$$l \approx a\left(\frac{n}{q^{a}-1}\right)^{\frac{n}{q^{a}-1}}.$$
The second realization can also be generalized to any d helpers,
for k.ltoreq.d.ltoreq.n-1.
[0042] Embodiments provide characterizations of linear
multiple-erasure repair schemes, and propose two schemes for
multiple erasures, where the evaluation points are in one coset and
in multiple cosets, respectively. Again, the parameter a is
tunable. According to one embodiment, a is a parameter that may be
selected to help reduce the repair bandwidth for an RS(n, k) code over
F. As long as the parameter a satisfies the condition in paragraph
[0041], a repair scheme can be provided to reach the repair
bandwidth in paragraph [0041]. For example, we can choose a=4 and
get repair bandwidth of 52 bits for RS(14,10) over GF(2.sup.8).
[0043] According to one aspect, proof is provided that any linear
repair scheme for multiple erasures in a scalar MDS code is
equivalent to finding a set of dual codewords satisfying certain
rank constraints.
[0044] For an RS (n, k) code with
$$e < \frac{1}{a-s}\log_{q} n$$
erasures, our first scheme achieves the repair bandwidth
$$\frac{el}{a}(n-e)(a-s)$$
for some a, s such that n<q.sup.a, r=n-k>q.sup.s and a
divides l.
[0045] For an RS (n, k) code, our second scheme works for
$e \le n-k$ erasures and n-e helpers. The repair bandwidth depends
on the location of the erasures and in most cases, we achieve
$\frac{e\,l}{d-k+e}\left(n-e+(n-k+e)(q^{a}-2)\right)$ where
$l \approx a\left(\frac{n}{q^{a}-1}\right)^{\frac{n}{q^{a}-1}}$
and a divides l. We demonstrate that repairing multiple erasures
simultaneously is advantageous compared to repairing single
erasures separately.
[0046] According to one aspect of the disclosure, processes and
configurations are provided for less repair bandwidth, and less
complexity for a given bandwidth. These processes may account for
the tradeoff between complexity and repair bandwidth. The processes
apply to any RS code parameters.
[0047] As used herein, the terms "a" or "an" shall mean one or more
than one. The term "plurality" shall mean two or more than two. The
term "another" is defined as a second or more. The terms "including"
and/or "having" are open ended (e.g., comprising). The term "or" as
used herein is to be interpreted as inclusive or meaning any one or
any combination. Therefore, "A, B or C" means any of the following:
A; B; C; A and B; A and C; B and C; A, B and C. An exception to
this definition will occur only when a combination of elements,
functions, steps or acts are in some way inherently mutually
exclusive.
[0048] Reference throughout this document to "one embodiment,"
"certain embodiments," "an embodiment," or similar term means that a
particular feature, structure, or characteristic described in
connection with the embodiment is included in at least one
embodiment. Thus, the appearances of such phrases in various places
throughout this specification are not necessarily all referring to
the same embodiment. Furthermore, the particular features,
structures, or characteristics may be combined in any suitable
manner in one or more embodiments without limitation.
Exemplary Embodiments
[0049] FIG. 1 illustrates an exemplary network architecture
according to one embodiment for error correction of distributed
data. This exemplary network architecture may include efficient
repair of Reed-Solomon codes. Storage in a distributed storage
system (DSS) often requires erasure encoding. System 100 relates to an
exemplary distributed storage system. As discussed herein, systems
and processes are provided for determining repair codes for one or
more network applications. As shown in FIG. 1, system 100 includes
a storage controller 105, a plurality of storage arrays 110.sub.1-n,
client devices 115.sub.1-n, and repair device 120. System 100 can
include a plurality of components and/or devices relative to a
communication network 125. System 100 may be configured to provide
one or more applications for one or more network services and
network enterprises. Applications and enterprises may be executed
by client devices 115.sub.1-n and storage controller 105 may be
configured to store data for the enterprises and applications to
one or more storage arrays 110.sub.1-n. According to one
embodiment, repair device 120 may be configured to determine repair
codes for data stored by system 100. According to another
embodiment, one or more elements of system 100 may include functional
modules and/or components for determining erasure codes.
[0050] Elements of FIG. 1 are exemplary and system 100 may include
one or more other components or elements including one or more
servers, firewalls, routers, switches, security appliances,
antivirus servers, or other useful network devices, along with
appropriate software.
[0051] In distributed storage, storage nodes, such as storage
arrays 110.sub.1-n may be unavailable for data access due to node
failure or temporary high workload. To alleviate this issue,
erasure correcting codes are applied and each storage node
corresponds to one symbol of the codeword.
[0052] Embodiments described herein are directed to processes using
Reed-Solomon (RS) codes for storage. Erasure encoding may be used
for data structures mathematically transformed into n different
fragments. One or more original data structures may also be
referred to as a file or data. Operations described herein may be
directed to one or more data structures and file types. Coded
fragments or fragments can include any suitable piece of a file
from which the full file may be reconstructed (alone or in
conjunction with other fragments), including the formal pieces of a
file yielded by the erasure coding technique. Such encoding may be
referred to as (n, k) coding, in which any k of the n fragments
suffice to reconstruct the file. For example, in the case where n is
5 and k is 4, the original data structure might be stored with one
fragment on each of 5 storage nodes, and if any one of the storage
nodes fails, it is possible to reconstruct the original file from
the remaining four fragments. When a node failure occurs, it may be
desirable to reconstruct the erasure encoded file, so that once
again the full n fragments are available for redundancy.
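As a minimal sketch of the (n, k) = (5, 4) case described above, the following Python example encodes four data symbols into five fragments by evaluating a polynomial at five points and rebuilds the data from any four of them. The prime field GF(257), the evaluation points, and all function names are assumptions chosen for illustration; they are not the parameters of the embodiments.

```python
# Illustrative (n, k) = (5, 4) Reed-Solomon style code over the prime field GF(257).
# The field size, evaluation points, and helper names are assumptions.
P = 257
POINTS = [1, 2, 3, 4, 5]  # one evaluation point per storage node


def encode(data):
    """Treat data as polynomial coefficients and evaluate at every node's point."""
    return [sum(c * pow(x, i, P) for i, c in enumerate(data)) % P for x in POINTS]


def poly_mul_linear(poly, root):
    """Multiply poly (lowest degree first) by (x - root) modulo P."""
    out = [0] * (len(poly) + 1)
    for i, c in enumerate(poly):
        out[i] = (out[i] - root * c) % P
        out[i + 1] = (out[i + 1] + c) % P
    return out


def decode(fragments):
    """Lagrange-interpolate the k coefficients from k (point, value) pairs."""
    k = len(fragments)
    coeffs = [0] * k
    for xi, yi in fragments:
        basis, denom = [1], 1
        for xj, _ in fragments:
            if xj != xi:
                basis = poly_mul_linear(basis, xj)
                denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P  # modular inverse, Python 3.8+
        for t in range(k):
            coeffs[t] = (coeffs[t] + scale * basis[t]) % P
    return coeffs


data = [72, 101, 108, 112]
stored = encode(data)
# The node at index 2 is unavailable; the other four fragments suffice.
survivors = [(x, y) for i, (x, y) in enumerate(zip(POINTS, stored)) if i != 2]
recovered = decode(survivors)
rebuilt_fragment = encode(recovered)[2]
```

This naive approach downloads k full fragments to rebuild one; the repair schemes described in this document aim to transmit less than that.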
[0053] A plurality of repair schemes are provided. The comparison
of schemes described herein, as well as the comparison to previous
works, are shown in Tables I and II, and are discussed in more
detail below.
[0054] FIG. 2 depicts a process 200 for error correction of
distributed data based on modifying a Reed-Solomon correction code
according to one or more embodiments. Process 200 may be employed
by a device, such as a repair device (e.g., repair device 120) of a
system (e.g., system 100) and one or more other components.
According to one embodiment, process 200 may be initiated upon a
device (e.g., repair device 120) receiving a first correction code
at block 205. The device may receive the first correction code from
one or more client devices (e.g., one or more of client devices
115.sub.1-n) via the communication network 125. In one embodiment,
the first correction code is received for a plurality of data
fragments that are stored in a plurality of storage nodes.
Additionally, in one embodiment, the first correction code is a
Reed-Solomon code having a data symbol for each of the plurality of
storage nodes. The first correction code is represented as a
polynomial over a first finite field and has a first
subpacketization size. In one embodiment, a finite field has a
property such that calculations performed using one or more
elements of the field always result in an element with a value
that is within the field.
[0055] At block 210, process 200 includes constructing a second
correction code in response to at least one unavailable storage
node of the plurality of storage nodes and/or unavailability of one or
more data fragments. The second correction code may be represented
as a second polynomial over a second finite field and has an
increased subpacketization size relative to the first polynomial.
In one embodiment, the second correction code is a new Reed-Solomon
code constructed using the first correction code. In one
embodiment, the unavailability of the storage node may be due to an
erasure error in one of the plurality of data fragments stored in
the plurality of storage nodes. An erasure error may occur if a
storage location of a particular symbol included in one of the data
fragments stored in the storage nodes is unknown. In another
embodiment, the second correction code is constructed for single
erasure repair. In one example of the construction of the second
correction code for single erasure repair, the erasure repair
utilizes at least one of one coset and two cosets. In this example,
the code length (n) and dimension (k) of the first correction code
are maintained. In another example of the construction of the
second correction code for single erasure repair, the second
correction code utilizes multiple cosets. In this example, the code
length (n) of the second correction code is increased and
redundancy (r) is fixed. In another example of the construction of
the second correction code for single erasure repair, the second
correction code provides a scalar code with evaluation points
selected from one coset.
[0056] According to another example of the construction of the
second correction code for single erasure repair, the second
correction code is constructed with evaluation points that are
selected in two cosets. In this example, the erasure repair
includes selection of correction code polynomials that have a full
rank condition in a coset of an unavailable node and a rank 1 when
evaluated at another coset. In another example of the construction
of the second correction code for single erasure repair, evaluation
points for erasure repair are chosen from multiple cosets to
increase code length.
[0057] In another embodiment, the second correction code is
constructed for repairing multiple erasures in the plurality of
fragments. In one example of the construction of the second
correction code for multiple erasure repair, the erasure repair
includes use of a helper node to reconstruct at least one symbol
for a first data fragment. In this example, the reconstructed
symbol is used for erasure repair of a second fragment. The second
correction code (e.g., a new Reed-Solomon code) can be viewed as n
evaluations of a polynomial over F. These evaluation points are
part of one coset, of two cosets, and of multiple cosets,
respectively. As such, the evaluation point size can vary from a
very small number to the whole field size.
[0058] At block 214, process 200 may optionally determine an
erasure repair scheme for the plurality of data fragments. In one
embodiment, the determined erasure repair scheme may be used to repair a
single erasure or multiple erasures using the constructed second
correction code.
[0059] At block 215, erasure repair of the unavailable storage node
is performed using the second correction code. In one embodiment,
the second correction code is applied to available data fragments
of the plurality of data fragments for erasure repair. The erasure
repair utilizes at least one coset of a multiplicative group of the
second finite field. In one embodiment, repairing the erasure
includes using a first dual codeword and a second dual codeword as
repair polynomials. In this embodiment, a trace function is
combined with the dual codewords to generate fragments. In one
embodiment, the trace function is used to obtain subfield symbols,
as described in the Preliminaries section below. The use of the
trace function in combination with dual codewords is also discussed
in the Reed-Solomon Repair Schemes for Multiple Erasures and REPAIR
ALGORITHM FOR RS (14, 10) CODE sections.
[0060] According to one embodiment, erasure repair at block 215
includes a first scheme for an RS (n, k) code that
achieves the repair bandwidth
$\frac{l}{a}(n-1)(a-s)$
for some a, s such that $n < q^{a}$, $r = n-k > q^{s}$, and a divides l.
Specifically, for the RS (14, 10) code, we achieve repair bandwidth
of 52 bits with l=8, which is 35% better than the naive repair
scheme.
[0061] According to another embodiment, erasure repair at block 215
includes a second scheme that reaches the repair bandwidth of
$\frac{(n-1)(l+a)}{2}$
for some a such that $n \le 2(q^{a}-1)$, a divides l, and
$\frac{l}{a} \le r$.
[0062] According to another embodiment, erasure repair at block 215
includes a third scheme that attains the repair bandwidth of
$\frac{l}{r}\left(n+1+(r-1)(q^{a}-2)\right)$
when
$n \le (q^{a}-1)\log_{r}\frac{l}{a}$.
Another realization of the third scheme attains the repair
bandwidth of
$\frac{l}{r}\left(n-1+(r-1)(q^{a}-2)\right)$
where
$l \approx a\left(\frac{n}{q^{a}-1}\right)^{\frac{n}{q^{a}-1}}$.
The second realization can also be generalized to any d helpers,
for $k \le d \le n-1$.
[0063] Embodiments provide characterizations of linear
multiple-erasure repair schemes, and propose two schemes for
multiple erasures, where the evaluation points are in one coset and
in multiple cosets, respectively. Again, the parameter a is
tunable.
[0064] Process 200 may provide linear repair of RS codes. As used
herein, for a positive integer i, we use [i] to denote the set {1, 2,
. . . , i}. For integers a, b, we use a | b to denote that a
divides b. For real numbers $a_n$, $b_n$, which are functions of n, we
use $a_n \approx b_n$ to denote
$\lim_{n \to \infty} \frac{a_n}{b_n} = 1$.
For sets $A \subseteq B$, we use $B \setminus A$ to denote the
difference of A from B. For a finite field F, we denote by
$F^{*} = F \setminus \{0\}$ the corresponding multiplicative
group. We write $E \le F$ for E being a subfield of F. For an element
$\beta \in F$ and E a subset of F, we denote
$\beta E = \{\beta s, \forall s \in E\}$. $A^{T}$ denotes
the transpose of the matrix A.
[0065] At block 216, process 200 includes outputting corrected
data based on the erasure repair that is performed in block 215. In
one embodiment, errors in one or more of the plurality of data
fragments are corrected and the data fragments are restored in
their entirety such that information associated with all data
fragments stored in the storage nodes is accessible and known.
Output of data can include output of a corrected fragment.
[0066] According to one embodiment, erasure correcting codes in
erasure schemes determined by process 200 are applied with respect
to each storage node corresponding to one symbol of the codeword.
Similarly, erasure correcting codes determined by process 200 may
be for storage systems, distributed storage, and repairing
erasures. In one embodiment, process 200 also includes a parameter
that can be tuned, and provides a tradeoff between the
sub-packetization size and the repair bandwidth.
[0067] FIG. 3 depicts a diagram of a computing device 300 that may
be configured for error correction of distributed data based on
modifying a Reed-Solomon correction code according to one or more
embodiments. Unit 300 includes processor 305, memory 310, and
input/output interface 315. In some embodiments, unit 300 may also
include a repair module 320. In certain embodiments, repair module
320 may relate to a functional element of processor 305. Unit 300 may be
configured to receive data from one or more devices.
[0068] Processor 305 may be configured to provide one or more
repair functions, including determining one or more of a single
erasure, and multiple erasures. According to one embodiment,
processor 305 is configured to perform one or more operations.
Memory 310 may include ROM and RAM memory for operation of unit 300
and processor 305. Input/output interface 315 may include one or
more inputs or controls for receiving and providing data.
[0069] In one embodiment, input/output interface 315 may be
configured to receive a first correction code for a plurality of
data fragments that are stored in a plurality of storage nodes. In
one embodiment, the first correction code is a Reed-Solomon code
that has a data symbol for each of the plurality of storage nodes
and is represented as a polynomial over a first finite field with a
first subpacketization size. As stated above, in one embodiment, a
finite field has a property such that calculations performed using
one or more elements of the field always result in an element
having a value that is within the field.
[0070] According to one embodiment, the repair module 320 may be
coupled to the input/output interface 315. In this embodiment, the
repair module 320 may be configured to construct a second
correction code in response to at least one unavailable storage
node of the plurality of storage nodes. The second correction code
is represented as a second polynomial over a second finite field
and has an increased subpacketization size relative to the first
polynomial. In one embodiment, the second correction code is a new
Reed-Solomon code. As stated above, the unavailability may be due
to an erasure error in one of the plurality of data fragments. An
erasure error may occur if a storage location of a particular
symbol included in one of the data fragments is unknown. The repair
module 320 may construct a second correction code using one or more
operations discussed relative to process 200 of FIG. 2.
[0071] In one embodiment, the repair module 320 may be configured
to perform erasure repair in the unavailable storage node using the
second correction code. In one embodiment, the second correction
code is applied to available data fragments of the plurality of
data fragments for erasure repair. The erasure repair utilizes at
least one coset of a multiplicative group of the second finite
field. The repair module 320 may perform erasure repair using one
or more operations discussed relative to process 200 of FIG. 2. In
yet another embodiment, repair module 320 may be configured to
output a corrected data fragment based on the erasure repair. In
one embodiment, as stated above, errors in one or more of the
plurality of data fragments are corrected and the data fragments
are restored in their entirety such that all data fragments stored
in the storage nodes are accessible and known.
Preliminaries
[0072] In this section, a review of the linear repair scheme of RS codes
and a basic lemma used in our proposed schemes are provided.
[0073] The Reed-Solomon code RS (A, k) over $F = GF(q^{l})$ of dimension k
with n evaluation points
$A = \{\alpha_1, \alpha_2, \ldots, \alpha_n\} \subseteq F$
is defined as
$RS(A,k) = \{(f(\alpha_1), f(\alpha_2), \ldots, f(\alpha_n)) : f \in F[x], \deg(f) \le k-1\}$,
where deg( ) denotes the degree of a polynomial,
$f(x) = u_0 + u_1 x + u_2 x^{2} + \cdots + u_{k-1} x^{k-1}$,
and $u_i \in F$, $i = 0, 1, \ldots, k-1$, are the messages.
Every evaluation symbol $f(\alpha)$, $\alpha \in A$, is
called a codeword symbol or a storage node. The sub-packetization
size is defined as l, and $r = n-k$ denotes the number of parity
symbols.
[0074] Assume e nodes fail, $e \le n-k$, and we want to recover
them. The number of helper nodes is denoted by d. The amount of
information transmitted from the helper nodes is defined as the
repair bandwidth b, measured in the number of symbols over GF(q).
All the remaining $n-e = d$ nodes are assumed to be the helper nodes
unless stated otherwise. We define the normalized repair bandwidth
as
$\frac{b}{l\,d}$,
which is the average fraction of information transmitted from each
helper. The minimum storage regenerating (MSR) bound for the bandwidth
is
$b \ge \frac{e\,l\,d}{d-k+e}$. (2)
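As a worked instance of bound (2), assuming it carries the sub-packetization size l as a factor (that is, $b \ge e\,l\,d/(d-k+e)$), take the RS (14, 10) code with l=8, a single erasure (e=1), and d=13 helpers over GF(2):

```latex
b \;\ge\; \frac{e\,l\,d}{d-k+e}
  \;=\; \frac{1 \cdot 8 \cdot 13}{13 - 10 + 1}
  \;=\; 26 \text{ bits}.
```

Under these assumptions the bound is 26 bits, which can be compared with the 52 bits quoted for the first scheme at the same l.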
[0075] As mentioned before, codes achieving the MSR bound require
large sub-packetization sizes. In this section, we focus on the
single erasure case.
[0076] Assume B.ltoreq.F, namely, B is a subfield of F. A linear
repair scheme requires some symbols of the subfield B to be
transmitted from each helper node. If the symbols from the same
helper node are linearly dependent, the repair bandwidth decreases.
In particular, the scheme uses dual code to compute the failed node
and uses trace function to obtain the transmitted subfield symbols,
as detailed below. Assume $f(\alpha^{*})$ fails for some
$\alpha^{*} \in A$. For any polynomial $p(x) \in F[x]$
of which the degree is smaller than r,
$(\nu_1 p(\alpha_1), \nu_2 p(\alpha_2), \ldots, \nu_n p(\alpha_n))$
is a dual codeword of
RS (A, k), where $\nu_i$, $i \in [n]$, are non-zero constants
determined by the set A. We can thus repair the failed node
$f(\alpha^{*})$ from
$\nu_{\alpha^{*}}\, p(\alpha^{*})\, f(\alpha^{*}) = -\sum_{i=1,\ \alpha_i \ne \alpha^{*}}^{n} \nu_i\, p(\alpha_i)\, f(\alpha_i)$. (3)
[0077] The summation on the right side means that we add the
terms for all i from i=1 to i=n except the one for which
$\alpha_i = \alpha^{*}$.
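The dual-codeword repair in (3) can be sketched numerically. The example below is an illustration only: it works over the prime field GF(257) instead of an extension field, and the evaluation points, the repair polynomial p(x), and the function names are assumptions. It relies on the standard choice $\nu_i = 1/\prod_{j \ne i}(\alpha_i - \alpha_j)$ for the dual multipliers.

```python
# Sketch of repair through a dual codeword, equation (3), over GF(257).
# Field, evaluation points, and p(x) are assumed for illustration.
P = 257
A = [1, 2, 3, 4, 5, 6, 7]  # n = 7 evaluation points
K = 4                      # dimension k, so r = n - k = 3


def f(x, msg):
    """Evaluate the message polynomial (degree <= k - 1) at x."""
    return sum(c * pow(x, i, P) for i, c in enumerate(msg)) % P


def nu(i):
    """Dual multiplier nu_i = 1 / prod_{j != i} (alpha_i - alpha_j)."""
    prod = 1
    for j, aj in enumerate(A):
        if j != i:
            prod = prod * (A[i] - aj) % P
    return pow(prod, -1, P)


def p(x):
    """A repair polynomial of degree 2 < r; it has no root among the points in A."""
    return (x * x + 1) % P


def repair(failed, codeword):
    """Solve equation (3) for the failed symbol using the n - 1 helpers."""
    rhs = -sum(nu(i) * p(A[i]) * codeword[i]
               for i in range(len(A)) if i != failed) % P
    lhs_coeff = nu(failed) * p(A[failed]) % P
    return rhs * pow(lhs_coeff, -1, P) % P


msg = [10, 20, 30, 40]
codeword = [f(a, msg) for a in A]
repaired = [repair(i, codeword) for i in range(len(A))]
```

Here every helper sends a full field symbol; the bandwidth savings in this document come from sending traces of such products, which are subfield symbols.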
[0079] The trace function from F onto B is defined as
$\mathrm{tr}_{F/B}(\beta) = \beta + \beta^{q} + \cdots + \beta^{q^{l-1}}$, (4)
[0080] where $\beta \in F$, B=GF(q) is called the base
field, and q is a power of a prime number. It is a linear mapping
from F to B and satisfies
$\mathrm{tr}_{F/B}(\alpha\beta) = \alpha\,\mathrm{tr}_{F/B}(\beta)$ (5)
for all $\alpha \in B$.
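A small sketch of the trace in (4) for q=2 and l=4 follows. Field elements are stored as 4-bit integers, addition is XOR, and the reduction polynomial x^4 + x + 1 is an assumed choice; the function names are hypothetical.

```python
# Trace from F = GF(2^4) onto B = GF(2).
# The modulus x^4 + x + 1 is an assumed representation of the field.
MOD, L = 0b10011, 4


def gf_mul(a, b):
    """Multiply two field elements (shift-and-add, reduced modulo MOD)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << L):
            a ^= MOD
    return r


def trace(beta):
    """tr(beta) = beta + beta^2 + beta^4 + beta^8 for q = 2, l = 4."""
    total, power = 0, beta
    for _ in range(L):
        total ^= power
        power = gf_mul(power, power)  # next Frobenius power
    return total


values = [trace(b) for b in range(16)]
```

Every value lands in the base field {0, 1}, so a helper that sends a trace sends one bit instead of four.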
[0081] We define the rank $\mathrm{rank}_B(\{\gamma_1, \gamma_2, \ldots, \gamma_i\})$
to be the cardinality of a maximal subset of
$\{\gamma_1, \gamma_2, \ldots, \gamma_i\}$ that is linearly independent over B.
For example, for B=GF(2) and $\alpha \notin B$,
$\mathrm{rank}_B(\{1, \alpha, 1+\alpha\}) = 2$ because the subset
$\{1, \alpha\}$ is a maximal subset that is linearly independent over B and the
cardinality of the subset is 2.
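The rank over B=GF(2) can be computed by Gaussian elimination on bit vectors. The sketch below assumes field elements are written as integer bitmasks in a polynomial basis (so 1 is 0b01 and alpha is 0b10); the function name is hypothetical.

```python
def rank_gf2(elements):
    """Rank over GF(2) of field elements written as integer bitmasks."""
    basis = []  # reduced vectors, each with its own leading bit
    for v in elements:
        for b in basis:
            v = min(v, v ^ b)  # XOR with b only if that clears b's leading bit
        if v:
            basis.append(v)
    return len(basis)


one, alpha = 0b01, 0b10
example_rank = rank_gf2([one, alpha, one ^ alpha])  # the set {1, alpha, 1 + alpha}
```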
[0082] Assume we use polynomials $p_j(x)$, $j \in [l]$, to
generate different dual codewords, called repair polynomials.
Combining the trace function and the dual code, we have
$\mathrm{tr}_{F/B}\left(\nu_{\alpha^{*}}\, p_j(\alpha^{*})\, f(\alpha^{*})\right) = -\sum_{i=1,\ \alpha_i \ne \alpha^{*}}^{n} \mathrm{tr}_{F/B}\left(\nu_i\, p_j(\alpha_i)\, f(\alpha_i)\right)$. (6)
[0083] In a repair scheme, the helper $f(\alpha_i)$ transmits
$\{\mathrm{tr}_{F/B}(\nu_i\, p_j(\alpha_i)\, f(\alpha_i)) : j \in [l]\}$. (7)
[0084] TABLE I shows a comparison of different schemes for single
erasure. When a=l, the scheme is in one coset. When a=1, the scheme is in
multiple cosets.
TABLE I
Scheme | Repair bandwidth | Code length | Restrictions
Schemes in [6], [7] | $(n-1)(l-s)$ | $n \le q^{l}$ | $q^{s} \le r$
Scheme in [16] | $< \frac{l}{r}(n+1)$ | $n = \log_{r} l$ |
Scheme in [17] | ? $\frac{l}{r}(n-1)$ | $n^{n} \approx l$ |
Our scheme in one coset | $\le \frac{l}{a}(n-1)(a-s)$ | $n \le (q^{a}-1)$ | $q^{s} \le r$, $a \mid l$
Our scheme in two cosets | $< \frac{(n-1)(l+a)}{2}$ | $n \le 2(q^{a}-1)$ | $\frac{l}{a} \le r$, $a \mid l$
Our scheme in multiple cosets 1 | $< \frac{l}{r}\left(n+1+(r-1)(q^{a}-2)\right)$ | $n \le (q^{a}-1)m$ | $\frac{l}{a} = r^{m}$ for some integer m
Our scheme in multiple cosets 2 | $\frac{l}{r}\left(n-1+(r-1)(q^{a}-2)\right)$ | $n \le (q^{a}-1)m$ | $\frac{l}{a} \approx m^{m}$ for some integer m
? indicates data missing or illegible when filed
TABLE II
Scheme | Repair bandwidth | Code length | Restrictions
Scheme 1 in [19] | $\le$ ( n - e ) e - ? 2 | $n \le q^{l}$ | $q^{?} \le r$, $? < \log_{q} n$
Scheme 2 in [19] | $\le$ min ? ( ( n - ? ) ( - log q ( ? 2 ? - 1 ) ) ) | $n \le q^{l}$ |
Scheme in [22] | ? d - ? | $n^{n} \approx l$ |
Our scheme for multiple erasures in one coset | $\le \frac{e\,l}{a}(n-e)(a-s)$ | $n \le (q^{a}-1)$ | $q^{s} \le r$, $a \mid l$, $e < \frac{1}{a-s}\log_{q} n$
Our scheme for multiple erasures in multiple cosets | $\frac{e\,l}{n-k}\left(n-e+(n-k+e)(q^{a}-2)\right)$ | $n \le (q^{a}-1)m$ | $\frac{l}{a} \approx m^{m}$ for some integer m
? indicates data missing or illegible when filed
[0085] TABLE II shows a comparison for multiple erasures. When a=l
and s=l, the scheme is in one coset. When a=l, the scheme is in multiple
cosets.
[0086] Suppose
$\{\nu_{\alpha^{*}} p_1(\alpha^{*}), \nu_{\alpha^{*}} p_2(\alpha^{*}), \ldots, \nu_{\alpha^{*}} p_l(\alpha^{*})\}$
is a basis for F over B, and
assume $\{\mu_1, \mu_2, \ldots, \mu_l\}$ is its dual basis. Then,
$f(\alpha^{*})$ can be repaired by
$f(\alpha^{*}) = \sum_{j=1}^{l} \mu_j\, \mathrm{tr}_{F/B}\left(\nu_{\alpha^{*}}\, p_j(\alpha^{*})\, f(\alpha^{*})\right)$. (8)
[0087] Since $\nu_{\alpha^{*}}$ is a non-zero constant, we equivalently suppose
that $\{p_1(\alpha^{*}), \ldots, p_l(\alpha^{*})\}$ is a basis.
[0088] In fact, any linear repair scheme of an RS code for the
failed node $f(\alpha^{*})$ is equivalent to choosing $p_j(x)$,
$j \in [l]$, with degree smaller than r, such that
$\{p_1(\alpha^{*}), \ldots, p_l(\alpha^{*})\}$ forms a basis for F over B.
We call this the full rank condition:
$\mathrm{rank}_B(\{p_1(\alpha^{*}), p_2(\alpha^{*}), \ldots, p_l(\alpha^{*})\}) = l$. (9)
[0089] The repair bandwidth can be calculated from (7) and by
noting that $\nu_i f(\alpha_i)$ is a constant:
$b = \sum_{\alpha \in A,\ \alpha \ne \alpha^{*}} \mathrm{rank}_B(\{p_1(\alpha), p_2(\alpha), \ldots, p_l(\alpha)\})$. (10)
[0090] We call this the repair bandwidth condition.
[0091] The goal of a good RS code construction and its repair
scheme is to choose appropriate evaluation points A and polynomials
$p_j(x)$, $j \in [l]$, that can reduce the repair
bandwidth in (10) while satisfying (9).
[0092] The following lemma is due to the structure of the
multiplicative group of F, which will be used for selecting the
evaluation points.
[0093] Lemma 1. Assume $E \le F = GF(q^{l})$, then $F^{*}$ can be partitioned
into
$t \triangleq \frac{q^{l}-1}{|E|-1}$
cosets: $\{E^{*}, \beta E^{*}, \beta^{2} E^{*}, \ldots, \beta^{t-1} E^{*}\}$, where
$\beta$ is a primitive element of F.
[0094] Proof: The $q^{l}-1$ elements in
$F^{*}$ are $\{1, \beta, \beta^{2}, \ldots, \beta^{q^{l}-2}\}$ and $E^{*} \subseteq F^{*}$. Assume that t
is the smallest nonzero number that satisfies $\beta^{t} \in E^{*}$;
then we know that $\beta^{k} \in E^{*}$ if and only
if $t \mid k$. Also, $\beta^{k_1} \ne \beta^{k_2}$ when
$k_1 \ne k_2$ and $k_1, k_2 \le q^{l}-2$. Since there are
only $|E|-1$ nonzero distinct elements in E and $\beta^{q^{l}-1} = 1$, we have
$t = \frac{q^{l}-1}{|E|-1}$
and the t cosets are
$E^{*} = \{1, \beta^{t}, \beta^{2t}, \ldots, \beta^{(|E|-2)t}\}$,
$\beta E^{*} = \{\beta, \beta^{t+1}, \beta^{2t+1}, \ldots, \beta^{(|E|-2)t+1}\}$, . . . ,
$\beta^{t-1} E^{*} = \{\beta^{t-1}, \beta^{2t-1}, \beta^{3t-1}, \ldots, \beta^{(|E|-1)t-1}\}$.
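The partition in Lemma 1 can be checked on a small case. The sketch below assumes F = GF(2^4) represented with the reduction polynomial x^4 + x + 1, for which beta = x is a primitive element, and takes E = GF(2^2); the representation and names are illustrative assumptions.

```python
# Coset partition of F* for F = GF(2^4) and E = GF(2^2).
# Modulus x^4 + x + 1 and beta = x are assumed choices.
MOD, L = 0b10011, 4
BETA = 0b10


def gf_mul(a, b):
    """Multiply two field elements (shift-and-add, reduced modulo MOD)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << L):
            a ^= MOD
    return r


def gf_pow(a, e):
    """Raise a to the non-negative integer power e."""
    r = 1
    for _ in range(e):
        r = gf_mul(r, a)
    return r


size_F, size_E = 2 ** L, 2 ** 2
t = (size_F - 1) // (size_E - 1)  # number of cosets: 15 / 3 = 5
E_star = [gf_pow(BETA, t * i) for i in range(size_E - 1)]
cosets = [sorted(gf_mul(gf_pow(BETA, m), e) for e in E_star) for m in range(t)]
```

The five cosets are disjoint, each has |E| - 1 = 3 elements, and together they cover the 15 nonzero elements of F.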
Reed-Solomon Repair Schemes for Single Erasure.
[0095] Three RS repair schemes for single erasure are provided.
Evaluation points are part of one coset, two cosets and multiple
cosets for a single erasure. From these constructions, embodiments
can achieve different points on the tradeoff between the
sub-packetization size and normalized repair bandwidth.
[0096] Processes may be based on schemes that employ:
i) taking an original RS code and constructing a new code over a
larger finite field, so that the sub-packetization size l is increased;
ii) for the schemes using one and two cosets, keeping the code parameters
n, k the same as in the original code. Hence, for given n,
r=n-k, the sub-packetization size l increases, but we show that the
normalized repair bandwidth remains the same; and
iii) for the scheme using multiple cosets, increasing the code length n
while the redundancy r is fixed. Moreover, the code length n grows faster
than the sub-packetization size l. Therefore, for fixed n, r, the
sub-packetization size l decreases, and we show that the normalized
repair bandwidth is only slightly larger than that of the original
code.
A. Schemes in One Coset
[0097] Assume E=GF(q.sup.a) is a subfield of F=GF(q.sup.l) and
B=GF(q) is the base field, where q is a prime number. The
evaluation points of the code over F that we construct are part of
one coset in Lemma 1.
[0098] We first present the following lemma about the basis.
Lemma 2. Assume $\{\xi_1, \xi_2, \ldots, \xi_l\}$ is a basis for
$F = GF(q^{l})$ over $B = GF(q)$, then
$\{\xi_1^{q^{s}}, \xi_2^{q^{s}}, \ldots, \xi_l^{q^{s}}\}$,
$s \in [l]$, is also a basis.
[0099] Proof: Assume
$\{\xi_1^{q^{s}}, \xi_2^{q^{s}}, \ldots, \xi_l^{q^{s}}\}$,
$s \in [l]$, is not a basis for F over B, then there exist
$(\alpha_1, \alpha_2, \ldots, \alpha_l)$, not all zero,
$\alpha_i \in B$, $i \in [l]$, that satisfy
$\alpha_1 \xi_1^{q^{s}} + \alpha_2 \xi_2^{q^{s}} + \cdots + \alpha_l \xi_l^{q^{s}} = 0 = (\alpha_1 \xi_1 + \alpha_2 \xi_2 + \cdots + \alpha_l \xi_l)^{q^{s}}$, (11)
where the second equality holds because $\alpha_i^{q^{s}} = \alpha_i$ for
$\alpha_i \in B$ and the map $x \mapsto x^{q^{s}}$ is additive on F. Hence
$\alpha_1 \xi_1 + \cdots + \alpha_l \xi_l = 0$,
which is in contradiction to the assumption that
$\{\xi_1, \xi_2, \ldots, \xi_l\}$ is a basis for F over B.
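Lemma 2 can be spot-checked numerically. The sketch below assumes F = GF(2^4) with reduction polynomial x^4 + x + 1 and the polynomial basis {1, x, x^2, x^3}, and confirms that raising the basis to the power 2^s keeps it of full rank over GF(2) for every s in [l]. All names are hypothetical.

```python
# Numerical check of Lemma 2 for q = 2, l = 4.
# The modulus x^4 + x + 1 and the starting basis are assumed choices.
MOD, L = 0b10011, 4


def gf_mul(a, b):
    """Multiply two field elements (shift-and-add, reduced modulo MOD)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << L):
            a ^= MOD
    return r


def rank_gf2(elements):
    """Rank over GF(2) of field elements written as integer bitmasks."""
    basis = []
    for v in elements:
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)


def frobenius(x, s):
    """Return x^(2^s) by squaring s times."""
    for _ in range(s):
        x = gf_mul(x, x)
    return x


basis = [1, 2, 4, 8]  # {1, x, x^2, x^3}
ranks = {s: rank_gf2([frobenius(xi, s) for xi in basis]) for s in range(1, L + 1)}
```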
[0100] The following theorem shows the repair scheme using one
coset for the evaluation points.
Theorem 1. There exists an RS(n,k) code over $F = GF(q^{l})$ with repair
bandwidth
$b \le \frac{l}{a}(n-1)(a-s)$
symbols over $B = GF(q)$, where q is a prime number and a, s satisfy
$n < q^{a}$, $q^{s} \le n-k$, $a \mid l$.
[0101] Proof: Assume a field $F = GF(q^{l})$ is extended from $E = GF(q^{a})$,
$a \mid l$, and $\beta$ is a primitive element of F. We focus on the code
RS(A, k) of dimension k over F with evaluation points
$A = \{\alpha_1, \alpha_2, \ldots, \alpha_n\} \subseteq \beta^{m} E^{*}$
for some
$0 \le m < \frac{q^{l}-1}{q^{a}-1}$,
which is one of the cosets in Lemma 1.
[0102] The base field is B=GF(q) and (6) is used to repair the
failed node f(.alpha.*).
[0103] Construction I: For s=a-1, we choose
$p_j(x) = \frac{\mathrm{tr}_{E/B}\left(\xi_j\left(\frac{x}{\beta^{m}} - \frac{\alpha^{*}}{\beta^{m}}\right)\right)}{\frac{x}{\beta^{m}} - \frac{\alpha^{*}}{\beta^{m}}}$, $j \in [a]$, (12)
[0104] where $\{\xi_1, \xi_2, \ldots, \xi_a\}$ is a basis for E over B.
The degree of $p_j(x)$ is smaller than r since $q^{s} \le r$. When
$x = \alpha^{*}$, by (4) we have
$p_j(\alpha^{*}) = \xi_j$. (13)
[0105] So the polynomials satisfy
$\mathrm{rank}_B(\{p_1(\alpha^{*}), p_2(\alpha^{*}), \ldots, p_a(\alpha^{*})\}) = a$. (14)
When $x \ne \alpha^{*}$, since
$\mathrm{tr}_{E/B}\left(\xi_j\left(\frac{x}{\beta^{m}} - \frac{\alpha^{*}}{\beta^{m}}\right)\right) \in B$,
and $\frac{x}{\beta^{m}} - \frac{\alpha^{*}}{\beta^{m}}$
is a constant independent of j, we have
$\mathrm{rank}_B(\{p_1(x), p_2(x), \ldots, p_a(x)\}) = 1$. (15)
Let $\{\eta_1, \eta_2, \eta_3, \ldots, \eta_{l/a}\}$ be a basis for
F over E; the repair polynomials are chosen as
$\{\eta_1 p_j(x), \eta_2 p_j(x), \ldots, \eta_{l/a}\, p_j(x) : j \in [a]\}$. (16)
Since $p_j(x) \in E$, we can conclude that
$\mathrm{rank}_B(\{\eta_1 p_j(\alpha^{*}), \eta_2 p_j(\alpha^{*}), \ldots, \eta_{l/a}\, p_j(\alpha^{*}) : j \in [a]\}) = \frac{l}{a}\,\mathrm{rank}_B(\{p_1(\alpha^{*}), p_2(\alpha^{*}), \ldots, p_a(\alpha^{*})\}) = l$ (17)
satisfies the full rank condition, and for $x \ne \alpha^{*}$
$\mathrm{rank}_B(\{\eta_1 p_j(x), \eta_2 p_j(x), \ldots, \eta_{l/a}\, p_j(x) : j \in [a]\}) = \frac{l}{a}\,\mathrm{rank}_B(\{p_1(x), p_2(x), \ldots, p_a(x)\}) = \frac{l}{a}$. (18)
From (10) we can calculate the repair bandwidth
$b = \frac{l}{a}(n-1)$. (19)
[0106] Construction II: For $s \le a-1$,
$p_j(x) = \xi_j \prod_{i=1}^{q^{s}-1}\left(\frac{x}{\beta^{m}} - \left(\frac{\alpha^{*}}{\beta^{m}} - w_i^{-1}\xi_j\right)\right)$, $j \in [a]$, (20)
where $\{\xi_1, \xi_2, \ldots, \xi_a\}$ is a basis for
E over B and $W = \{w_0 = 0, w_1, w_2, \ldots, w_{q^{s}-1}\}$
is an s-dimensional subspace in E, $s < a$,
$q^{s} \le r$. It is easy to check that the degree of $p_j(x)$
is smaller than r since $q^{s} \le r$. When $x = \alpha^{*}$, we
have
$p_j(\alpha^{*}) = \xi_j^{q^{s}} \prod_{i=1}^{q^{s}-1} w_i^{-1}$. (21)
[0107] Since
$\prod_{i=1}^{q^{s}-1} w_i^{-1}$
is a constant, from Lemma 2 we have
$\mathrm{rank}_B(\{p_1(\alpha^{*}), p_2(\alpha^{*}), \ldots, p_a(\alpha^{*})\}) = a$. (22)
[0108] For $x \ne \alpha^{*}$, set
$x' = \frac{\alpha^{*}}{\beta^{m}} - \frac{x}{\beta^{m}} \in E$,
we have
$p_j(x) = \xi_j \prod_{i=1}^{q^{s}-1}\left(\frac{x}{\beta^{m}} - \left(\frac{\alpha^{*}}{\beta^{m}} - w_i^{-1}\xi_j\right)\right)$
$= \xi_j \prod_{i=1}^{q^{s}-1}\left(w_i^{-1}\xi_j - x'\right)$
$= \xi_j \prod_{i=1}^{q^{s}-1}\left(w_i^{-1} x'\right) \prod_{i=1}^{q^{s}-1}\left(\xi_j/x' - w_i\right)$
$= (x')^{q^{s}} \prod_{i=1}^{q^{s}-1}\left(w_i^{-1}\right) \prod_{i=0}^{q^{s}-1}\left(\xi_j/x' - w_i\right)$. (23)
[0109] Then,
$g(y) = \prod_{i=0}^{q^{s}-1}(y - w_i)$
is a linear mapping from E to itself whose kernel is W, so its image
has dimension a-s over B.
Since
$(x')^{q^{s}} \prod_{i=1}^{q^{s}-1}\left(w_i^{-1}\right)$
is a constant independent of j, we have
$\mathrm{rank}_B(\{p_1(x), p_2(x), \ldots, p_a(x)\}) \le a-s$. (24)
Let $\{\eta_1, \eta_2, \eta_3, \ldots, \eta_{l/a}\}$ be a basis for
F over E, then the polynomials are chosen as
$\{\eta_1 p_j(x), \eta_2 p_j(x), \ldots, \eta_{l/a}\, p_j(x) : j \in [a]\}$.
From (21) and (23) we know that $p_j(x) \in E$,
so we can conclude that
$\mathrm{rank}_B(\{\eta_1 p_j(\alpha^{*}), \eta_2 p_j(\alpha^{*}), \ldots, \eta_{l/a}\, p_j(\alpha^{*}) : j \in [a]\}) = \frac{l}{a}\,\mathrm{rank}_B(\{p_1(\alpha^{*}), p_2(\alpha^{*}), \ldots, p_a(\alpha^{*})\}) = l$ (25)
satisfies (9), and for $x \ne \alpha^{*}$
$\mathrm{rank}_B(\{\eta_1 p_j(x), \eta_2 p_j(x), \ldots, \eta_{l/a}\, p_j(x) : j \in [a]\}) = \frac{l}{a}\,\mathrm{rank}_B(\{p_1(x), p_2(x), \ldots, p_a(x)\}) \le \frac{l}{a}(a-s)$. (26)
Now from (10) we can calculate the repair bandwidth
$b \le \frac{l}{a}(n-1)(a-s)$. (27)
[0110] Rather than directly using existing schemes, the polynomials
(12) and (20) use a basis $\{\xi_1, \xi_2, \ldots, \xi_a\}$ for E over B.
Moreover, each polynomial is multiplied
with the basis for F over E to satisfy the full rank condition. In
this case, our embodiments and schemes significantly reduce the
repair bandwidth when the code length remains the same. Our
evaluation points are in a coset rather than the entire field F. It
should be noted that a here can be an arbitrary number that divides
l, and when a=l the bound in (27) becomes (n-1)(l-s). Note that the
normalized repair bandwidth
$\frac{b}{l(n-1)}$
decreases as a decreases. Therefore, our scheme outperforms
existing schemes when applied to the case of $l > \log_{q} n$.
[0111] Example 1. Assume q=2, l=9, a=3 and $E=\{0, 1, \alpha, \alpha^2, \ldots, \alpha^6\}$. Let $A=E^*$, n=7, k=5, so r=n-k=2. Choose $s=\log_2 r=1$ and W={0, 1} in Construction II. Then, we have $p_j(x)=\xi_j(x-\alpha^*+\xi_j)$. Let $\{\xi_1, \xi_2, \xi_3\}$ be $\{1, \alpha, \alpha^2\}$. It is easy to check that $\operatorname{rank}_B(\{p_1(\alpha^*), p_2(\alpha^*), p_3(\alpha^*)\})=3$ and $\operatorname{rank}_B(\{p_1(x), p_2(x), p_3(x)\})=2$ for $x \neq \alpha^*$. Therefore the repair bandwidth is b=36 bits, as suggested in Theorem 1. For the same (n, k, l), the repair bandwidth in prior schemes may be 48 bits. For another example, consider the RS(14, 10) code used in Facebook: we have a repair bandwidth of 52 bits for l=8, while the prior scheme requires 60 bits and the naive scheme requires 80 bits.
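The rank claims of Example 1 can be checked numerically. The following is a minimal sketch, assuming GF(8) is represented by 3-bit integers modulo the primitive polynomial x^3+x+1 (the source does not fix a representation); the helper names gf8_mul, rank_over_gf2, helper_ranks and repair_bandwidth_bits are hypothetical and introduced only for illustration.

```python
# Assumption: GF(8) = GF(2)[alpha]/(alpha^3 + alpha + 1); an element is a 3-bit integer,
# so alpha = 0b010 and addition (and subtraction) is XOR.

def gf8_mul(u, v):
    """Multiply two GF(8) elements (shift-and-add with modular reduction)."""
    result = 0
    while v:
        if v & 1:
            result ^= u
        u <<= 1
        if u & 0b1000:      # degree reached 3: reduce using alpha^3 = alpha + 1
            u ^= 0b1011
        v >>= 1
    return result

def rank_over_gf2(values):
    """Rank over B = GF(2) of GF(8) elements viewed as bit vectors."""
    pivots = {}             # highest set bit -> reduced vector
    for val in values:
        while val:
            top = val.bit_length() - 1
            if top not in pivots:
                pivots[top] = val
                break
            val ^= pivots[top]
    return len(pivots)

XI = [0b001, 0b010, 0b100]  # basis {1, alpha, alpha^2} of E over B
NODES = range(1, 8)         # evaluation points A = E*

def helper_ranks(alpha_star):
    """Rank of {p_j(x)} with p_j(x) = xi_j (x - alpha* + xi_j), for every x in A."""
    return {
        x: rank_over_gf2([gf8_mul(xi, x ^ alpha_star ^ xi) for xi in XI])
        for x in NODES
    }

def repair_bandwidth_bits(alpha_star, l=9, a=3):
    """Sum of helper ranks, scaled by l/a for the basis of F over E."""
    ranks = helper_ranks(alpha_star)
    return (l // a) * sum(r for x, r in ranks.items() if x != alpha_star)
```

Calling repair_bandwidth_bits for any failed node in E* sums a rank of 2 over the six helpers and multiplies by l/a=3, which is how the 36-bit figure arises.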
[0112] According to one embodiment, the scheme constructs a scalar
code. This scalar code may be the first example of such a scalar
code in the art.
B. Schemes in Two Cosets
[0113] According to one embodiment, the scheme can include
evaluation points chosen from two cosets. In this scheme,
polynomials are chosen that have full rank when evaluated at the
coset containing the failed node, and rank 1 when evaluated at the
other coset.
Theorem 2. There exists an RS(n,k) code over $F=GF(q^\ell)$ with repair bandwidth
$$b < (n-1)\frac{\ell+a}{2}$$
symbols over B=GF(q), where q is a prime number and a satisfies
$$n \le 2(q^a-1), \quad a \mid \ell, \quad \frac{\ell}{a} \le n-k.$$
[0114] Proof: Assume a field $F=GF(q^\ell)$ is extended from $E=GF(q^a)$ and $\beta$ is the primitive element of F. We focus on the code RS(A,k) over F of dimension k with evaluation points A consisting of n/2 points from $\beta^{m_1}E^*$ and n/2 points from $\beta^{m_2}E^*$,
$$0 \le m_1 < m_2 \le \frac{q^\ell-1}{q^a-1} \quad \text{and} \quad m_2-m_1=q^s, \quad s \in \{0, 1, \ldots, \ell/a\}.$$
[0115] In this case we view E as the base field and repair the failed node $f(\alpha^*)$ by
$$\operatorname{tr}_{F/E}(\upsilon_{\alpha^*} p_j(\alpha^*) f(\alpha^*)) = -\sum_{i=1,\, \alpha_i \neq \alpha^*}^{n} \operatorname{tr}_{F/E}(\upsilon_i p_j(\alpha_i) f(\alpha_i)). \tag{28}$$
[0116] Inspired by [6, Theorem 10], for $j \in [\ell/a]$, we choose
$$p_j(x) = \begin{cases} \left(\dfrac{x}{\beta^{m_2}}\right)^{j-1}, & \text{if } \alpha^* \in \beta^{m_1}E^*, \\[2ex] \left(\dfrac{x}{\beta^{m_1}}\right)^{j-1}, & \text{if } \alpha^* \in \beta^{m_2}E^*. \end{cases} \tag{29}$$
The degree of $p_j(x)$ is smaller than r when
$$\frac{\ell}{a} \le r.$$
[0117] Then, we check the rank in each case.
[0118] When $\alpha^* \in \beta^{m_2}E^*$, if $x=\beta^{m_1}\gamma \in \beta^{m_1}E^*$ for some $\gamma \in E^*$,
$$p_j(x) = \left(\frac{x}{\beta^{m_1}}\right)^{j-1} = \gamma^{j-1}, \tag{30}$$
so
$$\operatorname{rank}_E(\{p_1(x), p_2(x), \ldots, p_{\ell/a}(x)\}) = 1. \tag{31}$$
If $x=\beta^{m_2}\gamma \in \beta^{m_2}E^*$ for some $\gamma \in E^*$,
$$p_j(x) = \left(\frac{x}{\beta^{m_1}}\right)^{j-1} = (\beta^{m_2-m_1})^{j-1}\gamma^{j-1}. \tag{32}$$
Since $m_2-m_1=q^s$ and
$$\{1, \beta, \beta^2, \ldots, \beta^{\ell/a-1}\}$$
is the polynomial basis for F over E, from Lemma 2 we know that
$$\operatorname{rank}_E(\{p_1(x), p_2(x), \ldots, p_{\ell/a}(x)\}) = \frac{\ell}{a}. \tag{33}$$
When $\alpha^* \in \beta^{m_1}E^*$, if $x=\beta^{m_1}\gamma \in \beta^{m_1}E^*$ for some $\gamma \in E^*$,
$$p_j(x) = \left(\frac{x}{\beta^{m_2}}\right)^{j-1} = (\beta^{m_1-m_2})^{j-1}\gamma^{j-1} = (\beta^{m_2-m_1})^{1-\ell/a}(\beta^{m_2-m_1})^{\ell/a-j}\gamma^{j-1}. \tag{34}$$
[0119] Since $(\beta^{m_2-m_1})^{1-\ell/a}$ is a constant, from Lemma 2 we know that
$$\operatorname{rank}_E(\{p_1(x), p_2(x), \ldots, p_{\ell/a}(x)\}) = \frac{\ell}{a}. \tag{35}$$
If $x=\beta^{m_2}\gamma \in \beta^{m_2}E^*$ for some $\gamma \in E^*$,
$$p_j(x) = \left(\frac{x}{\beta^{m_2}}\right)^{j-1} = \gamma^{j-1}, \tag{36}$$
so
$$\operatorname{rank}_E(\{p_1(x), p_2(x), \ldots, p_{\ell/a}(x)\}) = 1. \tag{37}$$
[0120] Therefore,
$$\{p_j(\alpha^*), j \in [\ell/a]\}$$
has full rank over E, for any evaluation point $\alpha^* \in A$. For x from the coset containing $\alpha^*$, the polynomials have rank $\ell/a$, and for x from the other coset, the polynomials have rank 1. Then, the repair bandwidth in symbols over B can be calculated from (10) as
$$b = \frac{\ell}{a}\left(\frac{n}{2}-1\right)\log_q|E| + \frac{n}{2}\log_q|E| = (n-1)\frac{\ell+a}{2} - \frac{\ell-a}{2} < (n-1)\frac{\ell+a}{2}. \tag{38}$$
[0121] Example 2. Take the RS(14, 11) code over $F=GF(2^{12})$ for example. Let $\beta$ be the primitive element in F, a=4, l/a=3 and $A=E^* \cup \beta E^*$. Assume $\alpha^* \in \beta E^*$; then $\{p_j(x), j \in [3]\}$ is the set $\{1, x, x^2\}$. It is easy to check that when $x \in \beta E^*$ the polynomials have full rank and when $x \in E^*$ the polynomials have rank 1. The total repair bandwidth is 100 bits. For the same (n, k, l), the repair bandwidth of our scheme in one coset is 117 bits. For prior schemes, which only work for l/a=2, we can only choose a=6 and get the repair bandwidth of 114 bits for the same (n, k, l).
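The bit counts quoted in Example 2 follow from simple arithmetic on (27) and (38). The sketch below is illustrative only; the function names are hypothetical, and q=2 is assumed so that symbols over B are bits.

```python
def one_coset_bound(n, l, a, s):
    """Upper bound (27): b <= (l/a)(n-1)(a-s), in symbols over B."""
    return (l // a) * (n - 1) * (a - s)

def two_coset_bandwidth(n, l, a):
    """Bandwidth (38): the n/2 - 1 helpers sharing the failed node's coset
    each send l symbols over B, and the n/2 helpers in the other coset send a."""
    half = n // 2
    return l * (half - 1) + a * half

def two_coset_bound(n, l, a):
    """Strict upper bound of Theorem 2: (n-1)(l+a)/2."""
    return (n - 1) * (l + a) / 2

# RS(14, 11) over GF(2^12), as in Example 2.
ours_two_cosets = two_coset_bandwidth(14, 12, 4)    # a = 4, l/a = 3
ours_one_coset = one_coset_bound(14, 12, 4, 1)      # s = 1 because 2^s <= r = 3
prior_two_cosets = two_coset_bandwidth(14, 12, 6)   # l/a = 2 forces a = 6
```

The same one_coset_bound function reproduces the 36 bits of Example 1 with (n, l, a, s) = (7, 9, 3, 1).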
C. Schemes in Multiple Cosets
[0122] In the schemes in this subsection, we extend an original
code to a new code over a larger field and the evaluation points
are chosen from multiple cosets in Lemma 1 to increase the code
length. The construction ensures that for fixed n, the
sub-packetization size is smaller than the original code. If the
original code satisfies several conditions to be discussed soon,
the repair bandwidth in the new code is only slightly larger than
that of the original code.
[0123] Particularly, if the original code is an MSR code, then we
can get the new code in a much smaller sub-packetization level with
a small extra repair bandwidth. Also, if the original code works
for any number of helpers and multiple erasures, the new code works
for any number of helpers and multiple erasures, too. We discuss
multiple erasures below.
[0124] We first prove a lemma regarding the ranks over different base fields, and then describe the new code.
[0125] Lemma 3. Let B=GF(q), $F'=GF(q^{\ell'})$, $E=GF(q^a)$, $F=GF(q^\ell)$, $\ell=a\ell'$. Here a and $\ell'$ are relatively prime and q can be any power of a prime number. For any set $\{\gamma_1, \gamma_2, \ldots, \gamma_{\ell'}\} \subseteq F' \le F$, we have
$$\operatorname{rank}_E(\{\gamma_1, \gamma_2, \ldots, \gamma_{\ell'}\}) = \operatorname{rank}_B(\{\gamma_1, \gamma_2, \ldots, \gamma_{\ell'}\}). \tag{39}$$
[0126] Proof: Assume $\operatorname{rank}_B(\{\gamma_1, \gamma_2, \ldots, \gamma_{\ell'}\})=c$ and, without loss of generality, $\{\gamma_1, \gamma_2, \ldots, \gamma_c\}$ are linearly independent over B. Then, we can construct $\{\gamma'_{c+1}, \gamma'_{c+2}, \ldots, \gamma'_{\ell'}\} \subseteq F'$ to make $\{\gamma_1, \gamma_2, \ldots, \gamma_c, \gamma'_{c+1}, \gamma'_{c+2}, \ldots, \gamma'_{\ell'}\}$ form a basis for F' over B.
[0127] Assume we get F' by adjoining $\beta$ to B. Then, $\{1, \beta, \beta^2, \ldots, \beta^{\ell'-1}\}$ is a basis for both F over E, and F' over B. So, any symbol $y \in F$ can be presented as a linear combination of $\{1, \beta, \beta^2, \ldots, \beta^{\ell'-1}\}$ with some coefficients in E. Also, there is an invertible linear transformation with coefficients in B between $\{\gamma_1, \gamma_2, \ldots, \gamma_c, \gamma'_{c+1}, \gamma'_{c+2}, \ldots, \gamma'_{\ell'}\}$ and $\{1, \beta, \beta^2, \ldots, \beta^{\ell'-1}\}$, because they are both bases for F' over B. Combined with the fact that $\{1, \beta, \beta^2, \ldots, \beta^{\ell'-1}\}$ is also a basis for F over E, we can conclude that any symbol $y \in F$ can be represented as
$$y = x_1\gamma_1 + x_2\gamma_2 + \cdots + x_c\gamma_c + x_{c+1}\gamma'_{c+1} + \cdots + x_{\ell'}\gamma'_{\ell'} \tag{40}$$
with some coefficients $x_i \in E$, which means that $\{\gamma_1, \gamma_2, \ldots, \gamma_c, \gamma'_{c+1}, \gamma'_{c+2}, \ldots, \gamma'_{\ell'}\}$ is also a basis for F over E. Then, we have that $\{\gamma_1, \gamma_2, \ldots, \gamma_c\}$ are linearly independent over E,
$$\operatorname{rank}_E(\{\gamma_1, \gamma_2, \ldots, \gamma_{\ell'}\}) \ge c = \operatorname{rank}_B(\{\gamma_1, \gamma_2, \ldots, \gamma_{\ell'}\}). \tag{41}$$
Since $B \le E$, we also have
$$\operatorname{rank}_E(\{\gamma_1, \gamma_2, \ldots, \gamma_{\ell'}\}) = \operatorname{rank}_B(\{\gamma_1, \gamma_2, \ldots, \gamma_{\ell'}\}). \tag{42}$$
[0128] Theorem 3. Assume there exists an RS(n',k') code over $F'=GF(q^{\ell'})$ with evaluation points set A'. The evaluation points are linearly independent over B=GF(q). The repair bandwidth is b' and the repair polynomials are $p'_j(x)$.
[0129] Then, we can construct a new RS(n, k) code over $F=GF(q^\ell)$, $\ell=a\ell'$, with $n=(q^a-1)n'$, $k=n-n'+k'$ and repair bandwidth of $b=ab'(q^a-1)+(q^a-2)\ell$ symbols over B=GF(q) if we can find new repair polynomials $p_j(x) \in F[x]$, $j \in [\ell']$, with degrees less than n-k that satisfy
$$\operatorname{rank}_E(\{p_1(x), p_2(x), \ldots, p_{\ell'}(x)\}) = \operatorname{rank}_B(\{p'_1(\alpha), p'_2(\alpha), \ldots, p'_{\ell'}(\alpha)\}), \quad x \in \alpha E^*,\ \alpha \in A'. \tag{43}$$
[0130] Proof: We first prove the case when a and $\ell'$ are relatively prime using Lemma 3; the case when a and $\ell'$ are not necessarily relatively prime is proved in Appendix A. Assume the evaluation points of the original code are $A'=\{\alpha_1, \alpha_2, \ldots, \alpha_{n'}\}$; then from Lemma 3 we know that they are also linearly independent over E, so for $i \neq j$ there do not exist $\gamma_i, \gamma_j \in E^*$ that satisfy $\alpha_i\gamma_i=\alpha_j\gamma_j$, which implies that $\{\alpha_1E^*, \alpha_2E^*, \ldots, \alpha_{n'}E^*\}$ are distinct cosets. Then, we can extend the evaluation points to be
$$A = \{\alpha_1E^*, \alpha_2E^*, \ldots, \alpha_{n'}E^*\} \tag{44}$$
and $n=(q^a-1)n'$. We keep the same redundancy r=n'-k' for the new code, so k=n-r.
[0131] For the new code we use $p_j(x) \in F[x]$, $j \in [\ell']$, to repair the failed node $f(\alpha^*)$:
$$\operatorname{tr}_{F/E}(\upsilon_{\alpha^*} p_j(\alpha^*) f(\alpha^*)) = -\sum_{\alpha \in A,\, \alpha \neq \alpha^*} \operatorname{tr}_{F/E}(\upsilon_{\alpha} p_j(\alpha) f(\alpha)). \tag{45}$$
[0132] Assume the failed node is $f(\alpha^*)$ and $\alpha^* \in \alpha_iE^*$. Then, for the node $x \in \alpha_iE^*$, because the original code satisfies the full rank condition, we have
$$\operatorname{rank}_E(\{p_1(x), p_2(x), \ldots, p_{\ell'}(x)\}) = \operatorname{rank}_B(\{p'_1(\alpha_i), p'_2(\alpha_i), \ldots, p'_{\ell'}(\alpha_i)\}) = \ell'. \tag{46}$$
Then we can recover the failed node with $p_j(x)$, and each helper in the coset containing the failed node transmits $\ell'$ symbols over E.
[0133] For helpers in the other cosets, $x \in \alpha_\epsilon E^*$, $\epsilon \neq i$, by (43),
$$\operatorname{rank}_E(\{p_1(x), p_2(x), \ldots, p_{\ell'}(x)\}) = \operatorname{rank}_B(\{p'_1(\alpha_\epsilon), p'_2(\alpha_\epsilon), \ldots, p'_{\ell'}(\alpha_\epsilon)\}). \tag{47}$$
Then every helper in these cosets transmits
$$\frac{b'}{n'-1}$$
symbols in E on average.
[0134] The repair bandwidth of the new code can be calculated from the repair bandwidth condition (10) as
$$b = \frac{b'}{n'-1}(n'-1)\,|E^*|\,a + (|E^*|-1)\,\ell' a = ab'(q^a-1) + (q^a-2)\ell. \tag{48}$$
[0135] Note that the calculations in (48) and (38) are similar in the sense that a helper in the coset containing the failure naively transmits the entire stored information, and the other helpers use the same bandwidth as in the original code. As a special case of Theorem 3, when
$$b' = \frac{\ell'}{r}(n'-1)$$
matches the MSR bound (1), we get
$$b = \frac{\ell}{r}(n-1) + \frac{\ell}{r}(r-1)(q^a-2), \tag{49}$$
where the second term is the extra bandwidth compared to the MSR bound.
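As a sanity check on the algebra that leads from (48) to (49), the sketch below evaluates both expressions with exact rational arithmetic. The function names and the sample parameter values are assumptions chosen for illustration; they are not taken from the source.

```python
from fractions import Fraction

def extended_bandwidth(b_prime, l_prime, a, q):
    """Equation (48): b = a*b'*(q^a - 1) + (q^a - 2)*l, with l = a*l'."""
    l = a * l_prime
    return a * b_prime * (q**a - 1) + (q**a - 2) * l

def msr_special_case(n_prime, r, l_prime, a, q):
    """Equation (49): b = (l/r)(n-1) + (l/r)(r-1)(q^a - 2), with n = (q^a - 1)*n'."""
    l = a * l_prime
    n = (q**a - 1) * n_prime
    return Fraction(l, r) * (n - 1) + Fraction(l, r) * (r - 1) * (q**a - 2)

def msr_original_bandwidth(n_prime, r, l_prime):
    """b' = l'(n'-1)/r, the MSR bandwidth assumed for the original code."""
    return Fraction(l_prime * (n_prime - 1), r)

# Illustrative values (hypothetical): n' = 4, r = 2, l' = 16, a = 3, q = 2.
b_prime = msr_original_bandwidth(4, 2, 16)
b_from_48 = extended_bandwidth(b_prime, 16, 3, 2)
b_from_49 = msr_special_case(4, 2, 16, 3, 2)
```

Both routes give the same value, which is consistent with (49) being (48) rewritten for the MSR choice of b'.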
[0136] Next, we apply Theorem 3 to the near-MSR code and the MSR
code.
Theorem 4. There exists an RS(n,k) code over $F=GF(q^\ell)$ of which
$$n = (q^a-1)\log_r\frac{\ell}{a}$$
and $a \mid \ell$, such that the repair bandwidth satisfies
$$b < \frac{\ell}{n-k}\left[n+1+(n-k-1)(q^a-2)\right],$$
measured in symbols over B=GF(q) for some prime number q.
[0137] Proof: We first prove the case when a and $\ell'$ are relatively prime using Lemma 3; the case when a and $\ell'$ are not necessarily relatively prime is proved in Appendix A. We use the code in [16] as the original code. The original code is defined in $F'=GF(q^{\ell'})$ and $\ell'=r^{n'}$. The evaluation points are
$$A' = \{\beta, \beta^r, \beta^{r^2}, \ldots, \beta^{r^{n'-1}}\},$$
where $\beta$ is a primitive element of F'.
[0138] In the original code, for $c=0, 1, 2, \ldots, \ell'-1$, we write its r-ary expansion as $c=(c_{n'} c_{n'-1} \ldots c_1)$, where $0 \le c_i \le r-1$ is the i-th digit from the right. Assuming the failed node is $f(\beta^{r^{i-1}})$, the repair polynomials are chosen to be
$$p'_j(x) = \beta^c x^s, \quad c_i=0,\ s=0, 1, 2, \ldots, r-1,\ x \in F'. \tag{50}$$
Here c varies from 0 to $\ell'-1$ given that $c_i=0$, and s varies from 0 to r-1. So, we have $\ell'$ polynomials in total. The subscript j is indexed by c and s, and by a small abuse of the notation, we write $j \in [\ell']$.
[0139] In the new code, let us define $E=GF(q^a)$ of which a and $\ell'$ are relatively prime. Adjoining $\beta$ to E, we get $F=GF(q^\ell)$, $\ell=a\ell'$. The new evaluation points are
$$A = \{\beta E^*, \beta^r E^*, \beta^{r^2} E^*, \ldots, \beta^{r^{n'-1}} E^*\}.$$
Since A' is part of the polynomial basis for F' over B, we know that
$$\{\beta, \beta^r, \beta^{r^2}, \ldots, \beta^{r^{n'-1}}\}$$
are linearly independent over B. Hence, we can apply Lemma 3 and the cosets are distinct, resulting in
$$n = |A| = (q^a-1)\log_r\frac{\ell}{a}.$$
[0140] In our new code, let us assume the failed node is $f(\alpha^*)$ and $\alpha^* \in \beta^{r^{i-1}}E^*$, and we choose the polynomial $p_j(x)$ with the same form as $p'_j(x)$,
$$p_j(x) = \beta^c x^s, \quad c_i=0,\ s=0, 1, 2, \ldots, r-1,\ x \in F. \tag{51}$$
[0141] For nodes corresponding to $x=\beta^{r^t}\gamma \in \beta^{r^t}E^*$, for some $\gamma \in E^*$, we know that
$$p_j(x) = \beta^c x^s = \beta^c(\gamma\beta^{r^t})^s = \gamma^s p'_j(\beta^{r^t}). \tag{52}$$
Since $p'_j(\beta^{r^t}) \in F'$, from Lemma 3, we have
$$\operatorname{rank}_E(\{\gamma^s p'_1(\beta^{r^t}), \gamma^s p'_2(\beta^{r^t}), \ldots, \gamma^s p'_{\ell'}(\beta^{r^t})\}) = \operatorname{rank}_E(\{p'_1(\beta^{r^t}), p'_2(\beta^{r^t}), \ldots, p'_{\ell'}(\beta^{r^t})\}) = \operatorname{rank}_B(\{p'_1(\beta^{r^t}), p'_2(\beta^{r^t}), \ldots, p'_{\ell'}(\beta^{r^t})\}), \tag{53}$$
which satisfies (43). Since the repair bandwidth of the original code is
$$b' < (n'+1)\frac{\ell'}{r},$$
from (48) we can calculate the repair bandwidth as
$$b = ab'(q^a-1) + (q^a-2)\ell < \frac{\ell}{r}\left[n+1+(r-1)(q^a-2)\right], \tag{54}$$
where the second term is the extra bandwidth compared to the original code.
[0142] Example 3. We take an RS(4, 2) code in $GF(2^{16})$ as the original code and extend it with a=3, $|E^*|=7$ to an RS(28, 26) code in $GF(2^{48})$ with normalized repair bandwidth of
$$\frac{b}{(n-1)\ell} < 0.65.$$
By comparison, applying the original construction directly to obtain an RS(28, 26) code achieves the normalized repair bandwidth of
$$\frac{b}{(n-1)\ell} < 0.54$$
while it requires $\ell=2.7 \times 10^8$. Our scheme has a much smaller l, while the repair bandwidth is a bit larger.
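The figures in Example 3 can be reproduced by evaluating the stated bounds. The sketch below takes the right-hand side of (54) and the bound on b' quoted just before it at face value and divides each by (n-1)l; the function names are hypothetical and the code is only an arithmetic illustration.

```python
from fractions import Fraction

def theorem4_normalized_bound(n, r, a, q):
    """Right-hand side of (54) divided by (n-1)*l."""
    return Fraction(n + 1 + (r - 1) * (q**a - 2), r * (n - 1))

def original_normalized_bound(n_prime, r):
    """The stated bound b' < (n'+1) l'/r divided by (n'-1)*l'."""
    return Fraction(n_prime + 1, r * (n_prime - 1))

# Extended code: original RS(4, 2) with l' = r^n' = 16, extended with a = 3, q = 2.
n_prime, r, a, q = 4, 2, 3, 2
n = (q**a - 1) * n_prime                  # code length of the new code
l = a * r**n_prime                        # sub-packetization of the new code
extended = theorem4_normalized_bound(n, r, a, q)

# Original construction applied directly at length 28.
direct = original_normalized_bound(28, r)
direct_l = r**28                          # sub-packetization l' = r^n'
```

Under these assumptions the extended code has l=48 against roughly 2.7e8 for the direct construction, at the cost of a normalized bound of about 0.648 instead of about 0.537.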
[0143] In the above theorem, we extend a prior scheme to a linearly larger sub-packetization and an exponentially larger code length, which means that for the same code length, we can have a much smaller sub-packetization level.
[0144] Next, we show our second realization of the scheme in multiple cosets. Different from the previous constructions, this one allows any number of helpers, $k \le d \le n-1$. The sub-packetization size in the original code of a prior scheme satisfies $\ell' \approx (n')^{n'}$ when n' grows to infinity, thus in our new code it satisfies
$$\ell \approx a(n')^{n'}$$
for some integer a.
[0145] Theorem 5. Let q be a prime number. There exists an RS(n, k) code over $F=GF(q^\ell)$ of which
$$\ell = a s q_1 q_2 \cdots q_{\frac{n}{q^a-1}},$$
where $q_i$ is the i-th prime number that satisfies $s \mid (q_i-1)$, s=d-k+1 and a is some integer. d is the number of helpers, $k \le d \le (n-1)$. The average repair bandwidth is
$$b = \frac{d\ell}{(n-1)(d-k+1)}\left[n-1+(d-k)(q^a-2)\right]$$
measured in symbols over B=GF(q).
[0146] Proof: We first prove the case when a and $\ell'$ are relatively prime using Lemma 3; the case when a and $\ell'$ are not necessarily relatively prime is proved in Appendix A. We use the code of a prior scheme as the original code, where the number of helpers is d'. We set n-k=n'-k' and calculate the repair bandwidth for d helpers from the original code when d'=d-k+k'. Let us define $F_q(\alpha)$ to be the field obtained by adjoining $\alpha$ to the base field B. Similarly, we define $F_q(\alpha_1, \alpha_2, \ldots, \alpha_n)$ for adjoining multiple elements. Let $\alpha_i$ be an element of order $q_i$ over B. The code is defined in the field $F'=GF(q^{\ell'})=GF(q^{s q_1 q_2 \cdots q_{n'}})$, which is the degree-s extension of $F_q(\alpha_1, \alpha_2, \ldots, \alpha_{n'})$. The evaluation points are $A'=\{\alpha_1, \alpha_2, \ldots, \alpha_{n'}\}$.
[0147] Assuming the failed node is $f(\alpha_i)$ and the helpers are chosen from the set R', |R'|=d', the base field for repair is $F'_i$, defined as $F'_i=F_q(\alpha_j, j \in [n'], j \neq i)$. The repair polynomials are $\{\eta_t p'_j(x), t \in [q_i], j \in [s]\}$, where
$$p'_j(x) = x^{j-1} g'(x), \quad j \in [s],\ x \in F', \tag{55}$$
$$g'(x) = \prod_{\alpha \in A' \setminus (R' \cup \{\alpha_i\})} (x-\alpha), \quad x \in F', \tag{56}$$
[0148] and $\eta_t \in F'$, $t \in [q_i]$, are constructed such that $\{\eta_t p'_j(\alpha_i), t \in [q_i], j \in [s]\}$ forms a basis for F' over $F'_i$. The repair is done using
$$\operatorname{tr}_{F'/F'_i}(\upsilon_{\alpha_i} \eta_t p'_j(\alpha_i) f'(\alpha_i)) = -\sum_{\epsilon=1,\, \epsilon \neq i}^{n'} \operatorname{tr}_{F'/F'_i}(\upsilon_\epsilon \eta_t p'_j(\alpha_\epsilon) f'(\alpha_\epsilon)). \tag{57}$$
[0149] For $x \notin R' \cup \{\alpha_i\}$, $p'_j(x)=0$, so no information is transmitted. The original code reaches the MSR repair bandwidth
$$b' = \sum_{\alpha_\epsilon \in R'} \operatorname{rank}_{F'_i}(\{\eta_t p'_j(\alpha_\epsilon) : t \in [q_i], j \in [s]\}) = \frac{d'\ell'}{d'-k'+1}. \tag{58}$$
[0150] According to one or more embodiments, codes can define $E=GF(q^a)=F_q(\alpha_{n'+1})$, where a and $\ell'$ are relatively prime, and $\alpha_{n'+1}$ is an element of order a over B. Adjoining the primitive element of F' to E, we get $F=GF(q^\ell)$, $\ell=a\ell'$. The new code is defined in F. The evaluation points are extended to be $A=\{\alpha_1E^*, \alpha_2E^*, \ldots, \alpha_{n'}E^*\}$. Since $\{\alpha_1, \alpha_2, \ldots, \alpha_{n'}\}$ are linearly independent over B, Lemma 3 can be applied and the cosets are distinct. So, $n=|A|=(q^a-1)n'$. Assuming the failed node is $f(\alpha^*)$ with $\alpha^* \in \alpha_iE^*$ and the helpers are chosen from the set R, |R|=d, the base field for repair is $F_i$, which is defined by $F_i=F_q(\alpha_j, j \in [n'+1], j \neq i)$, for $i \in [n']$. The repair polynomials are defined as $\{\eta_t p_j(x), t \in [q_i], j \in [s]\}$, where
$$p_j(x) = x^{j-1} g(x), \quad j \in [s],\ x \in F, \tag{59}$$
$$g(x) = \prod_{\alpha \in A \setminus (R \cup \{\alpha^*\})} (x-\alpha), \quad x \in F, \tag{60}$$
and $\eta_t$ is the same as that in the original code. Then, the failed node can be repaired by
$$\operatorname{tr}_{F/F_i}(\upsilon_{\alpha^*} \eta_t p_j(\alpha^*) f(\alpha^*)) = -\sum_{\alpha \in A,\, \alpha \neq \alpha^*} \operatorname{tr}_{F/F_i}(\upsilon_\alpha \eta_t p_j(\alpha) f(\alpha)). \tag{61}$$
For $x \in \alpha E^*$, $\alpha \in A'$, we have
$$p_j(x) = \gamma^{j-1}\alpha^{j-1} g(x), \quad j \in [s], \tag{62}$$
for some $\gamma \in E^*$. If $x \notin R \cup \{\alpha^*\}$, since g(x)=0, no information is transmitted from node x. Next, we consider all other nodes.
[0151] For $x=\alpha\gamma$, $\alpha \in A'$, since g(x) is a constant independent of j, $\gamma \in F_i$ and $\eta_t, \alpha \in F'$, from Lemma 3 we have
$$\begin{aligned}\operatorname{rank}_{F_i}(\{\eta_t p_1(x), \eta_t p_2(x), \ldots, \eta_t p_s(x) : t \in [q_i]\}) &= \operatorname{rank}_{F_i}(\{\eta_t, \eta_t\gamma\alpha, \ldots, \eta_t\gamma^{s-1}\alpha^{s-1} : t \in [q_i]\}) \\ &= \operatorname{rank}_{F_i}(\{\eta_t, \eta_t\alpha, \ldots, \eta_t\alpha^{s-1} : t \in [q_i]\}) \\ &= \operatorname{rank}_{F'_i}(\{\eta_t, \eta_t\alpha, \ldots, \eta_t\alpha^{s-1} : t \in [q_i]\}) \\ &= \operatorname{rank}_{F'_i}(\{\eta_t p'_1(\alpha), \eta_t p'_2(\alpha), \ldots, \eta_t p'_s(\alpha) : t \in [q_i]\}),\end{aligned} \tag{63}$$
which satisfies (43).
[0152] When $k \le d < n-1$, assuming the helpers are randomly chosen from all the remaining nodes, the average repair bandwidth for different choices of the helpers can be calculated as
$$b = d\left[\frac{b'a}{d'} \cdot \frac{n-1-(q^a-2)}{n-1} + \ell' a\,\frac{q^a-2}{n-1}\right] \tag{64}$$
$$= \frac{d\ell}{d-k+1} + \frac{d}{n-1} \cdot \frac{\ell}{d-k+1}(d-k)(q^a-2). \tag{65}$$
[0153] Here in (64) the second term corresponds to the helpers in the failed node coset, the first term corresponds to the helpers in the other cosets, and in (65) we used d'-k'=d-k.
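The step from (64) to the closed form in Theorem 5 can be checked with exact arithmetic. The sketch below is a minimal illustration; the function names and the sample parameters are hypothetical, and b' is taken from the MSR value in (58).

```python
from fractions import Fraction

def average_bandwidth_eq64(n, k, d, a, q, l_prime, k_prime):
    """Equation (64), with d' = d - k + k' and b' = d' l' / (d' - k' + 1) from (58)."""
    d_prime = d - k + k_prime
    b_prime = Fraction(d_prime * l_prime, d_prime - k_prime + 1)
    same_coset = q**a - 2                     # other nodes in the failed node's coset
    other_term = b_prime * a / d_prime * Fraction(n - 1 - same_coset, n - 1)
    same_term = l_prime * a * Fraction(same_coset, n - 1)
    return d * (other_term + same_term)

def average_bandwidth_theorem5(n, k, d, a, q, l_prime):
    """Closed form: b = d*l / ((n-1)(d-k+1)) * [n - 1 + (d-k)(q^a - 2)]."""
    l = a * l_prime
    return Fraction(d * l, (n - 1) * (d - k + 1)) * (n - 1 + (d - k) * (q**a - 2))

# Hypothetical parameters: n' = 6, k' = 3, a = 2, q = 2, so n = 18 and k = 15.
n, k, d, a, q, l_prime, k_prime = 18, 15, 16, 2, 2, 60, 3
lhs = average_bandwidth_eq64(n, k, d, a, q, l_prime, k_prime)
rhs = average_bandwidth_theorem5(n, k, d, a, q, l_prime)
```

The two values agree for these parameters, which matches the substitution d'-k'=d-k used to pass from (64) to (65).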
[0154] In the case of d=n-1, the repair bandwidth of the code in Theorem 5 can be directly calculated from (48) as
$$b = ab'(q^a-1) + (q^a-2)\ell = \frac{\ell}{r}(n-1) + \frac{\ell}{r}(r-1)(q^a-2). \tag{66}$$
[0155] In (65) and (66), the second term is the extra repair bandwidth compared to the original code.
[0156] In Theorems 4 and 5, we constructed our schemes by extending previous schemes. However, it should be noted that since we only used the properties of the polynomials $p'_j(x)$, we have no restrictions on the dimensions k' of the original codes. So, in some special cases, even if k' is negative and the original codes do not exist, our theorems still hold. Thus, we can provide more feasible points of (n, k) using our schemes. This is illustrated in the example below.
Example 4. Let us take the RS(12,8) code as an example. We set q=2, s=4, $q_1=5$, $q_2=9$, $q_3=13$ and a=7; then $\ell'=2340$ and $\ell=16380$. Assuming the failed node is $f(\alpha^*)$ and $\alpha^* \in \alpha_1C$, we repair it in $F_1$ and set the polynomials as in (59). We can easily check that when $x \in \alpha_1C$, $\operatorname{rank}_{F_1}(\{\eta_t p_1(x), \eta_t p_2(x), \ldots, \eta_t p_s(x) : t \in [5]\})=20$, and when x is in other cosets, $\operatorname{rank}_{F_1}(\{\eta_t p_1(x), \eta_t p_2(x), \ldots, \eta_t p_s(x) : t \in [5]\})=5$. Therefore, we transmit 100 symbols in $F_1$, which can be normalized to
$$\frac{b}{(n-1)\ell} = 0.4545.$$
As such, a tradeoff is provided between l and b.
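The numbers in Example 4 can be recomputed as follows. This sketch assumes that the 12 nodes are split evenly over three cosets (4 per coset), which is an inference from the stated total of 100 symbols rather than something the source states, and that F has degree s*q_1=20 over F_1; the function name is hypothetical.

```python
from fractions import Fraction
from math import prod

def example4_summary(n=12, s=4, q_list=(5, 9, 13), a=7,
                     rank_same=20, rank_other=5, nodes_per_coset=4):
    """Recompute l', l, the transmitted symbols over F_1, and the normalized bandwidth.

    nodes_per_coset is an assumption (12 nodes spread evenly over 3 cosets).
    """
    l_prime = s * prod(q_list)
    l = a * l_prime
    same_coset_helpers = nodes_per_coset - 1      # helpers sharing the failed node's coset
    other_helpers = n - nodes_per_coset           # helpers in the remaining cosets
    symbols = same_coset_helpers * rank_same + other_helpers * rank_other
    # One node stores rank_same symbols over F_1, so normalize by (n-1)*rank_same.
    normalized = Fraction(symbols, (n - 1) * rank_same)
    return l_prime, l, symbols, normalized

l_prime, l, symbols, normalized = example4_summary()
```

With these assumptions the three same-coset helpers contribute 60 symbols and the eight others contribute 40, giving the stated total and the ratio 5/11.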
D. Numerical Evaluations and Discussions
[0157] Advantages of the embodiments may be shown relative to existing proposed schemes. Table I shows the repair bandwidth and the code length of each scheme. For the comparison, FIGS. 4 and 5 show the performance of each scheme when the sub-packetization changes, given (n, k)=(12, 10) and (12, 8), respectively, considering only n-1 helpers. Two single points
$$\left(\log_2(\ell) = 53.5,\ \frac{b}{(n-1)\ell} = 0.50\right)$$
in RS(12, 10) codes and
$$\left(\log_2(\ell) = 64.4,\ \frac{b}{(n-1)\ell} = 0.25\right)$$
in RS(12, 8) codes are not shown in the figures; they can be achieved by our second realization in multiple cosets. We make the following observations.
[0158] FIG. 4 illustrates a graphical representation 400 showing
normalized repair bandwidth values that correspond to various
subpacketization sizes as a result of implementing different repair
schemes according to one or more embodiments described herein.
Specifically, FIG. 4 also depicts a relationship between various
subpacketization sizes and normalized repair bandwidths upon the
implementations of the following repair schemes: one coset scheme
401, two cosets scheme 402, and multiple cosets schemes 403 and
404. FIG. 4 also depicts a Full-length code scheme 405. FIG. 4 also
shows a comparison of 3 schemes, q=2, n=12, k=10, r=2, x-axis is
the log scale sub-packetization size, y-axis is the normalized
repair bandwidth. Moreover, the normalized repair bandwidth values resulting from implementation of different schemes to correct a single erasure in one coset and multiple cosets, and multiple erasures in one coset and multiple cosets, are shown in Table III and Table IV.
[0159] For a fixed (n, k), we compare the normalized repair bandwidth $b/[(n-1)\ell]$ at different sub-packetization sizes. At the outset, it is noted that parameter a may be changed to adjust the sub-packetization size. Regarding the one coset scheme 401 and two cosets scheme 402 depicted in FIG. 4, the parameter a is determined by code length n, and will not be changed by increasing l. In addition, the normalized repair bandwidth will also remain unchanged. Regarding the multiple cosets schemes 403 and 404, the parameter a may be used to adjust the sub-packetization size. When q=2, a=1, the two schemes in multiple cosets coincide with prior methods. From Theorems 4 and 5 we know that for the two schemes,
$$\ell = a\,r^{\frac{n}{q^a-1}} \quad \text{and} \quad \ell \approx a\left(\frac{n}{q^a-1}\right)^{\frac{n}{q^a-1}},$$
respectively, which means that increasing a will decrease the sub-packetization l.
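To make the effect of a on the sub-packetization concrete, the sketch below evaluates the two expressions above for small parameters. The function names are hypothetical, and the second function returns only the approximation quoted in the text, not an exact value.

```python
def first_realization_l(n, r, a, q=2):
    """l = a * r^(n / (q^a - 1)); requires (q^a - 1) to divide n."""
    cosets, remainder = divmod(n, q**a - 1)
    if remainder:
        raise ValueError("q^a - 1 must divide n")
    return a * r**cosets

def second_realization_l_approx(n, a, q=2):
    """Approximation l ~ a * (n')^(n') with n' = n / (q^a - 1)."""
    cosets, remainder = divmod(n, q**a - 1)
    if remainder:
        raise ValueError("q^a - 1 must divide n")
    return a * cosets**cosets

# n = 12, r = 2, q = 2: moving from a = 1 to a = 2 shrinks l sharply.
l_a1 = first_realization_l(12, 2, 1)
l_a2 = first_realization_l(12, 2, 2)
approx_a1 = second_realization_l_approx(12, 1)
approx_a2 = second_realization_l_approx(12, 2)
```

For n=12 and r=2 the first expression drops from 4096 at a=1 to 32 at a=2, which illustrates the trend described in the paragraph; calling first_realization_l(28, 2, 3) gives the l=48 of Example 3.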
[0160] With respect to the Full-length code scheme 405, the
normalized repair bandwidth increases exponentially for
sub-packetization sizes ranging from 2 to 6. In addition, the
normalized repair bandwidth approaches a value of 1 for
sub-packetization sizes ranging from 8 to 16.
[0161] FIG. 5 illustrates a graphical representation 500 showing
normalized repair bandwidth values that correspond to various
subpacketization sizes as a result of implementing different repair
schemes. In particular, FIG. 5 depicts a relationship between
various subpacketization sizes and normalized repair bandwidths
upon the implementations of the following repair schemes: exemplary
one coset scheme 501, exemplary two cosets scheme 502, and
exemplary multiple cosets schemes 503 and 504. FIG. 5 also depicts
an exemplary Full-length code scheme 505. These schemes are
implemented according to one or more embodiments described herein.
FIG. 5 also illustrates a comparison of 3 schemes, q=2, n=12, k=8,
r=4, x-axis is the log scale sub-packetization size, y-axis is the
normalized repair bandwidth. Moreover, the normalized repair bandwidth values resulting from implementation of different schemes to correct a single erasure in one coset and multiple cosets, and multiple erasures in one coset and multiple cosets, are shown in Table III and Table IV.
[0162] A prior scheme can achieve one tradeoff point in FIG. 5, which can be viewed as a special case of our first scheme in multiple cosets (exemplary multiple cosets scheme 503). For fixed n, k, our schemes are better than the full-length code in prior schemes for all l, except when l=4, for which our scheme in one coset (exemplary one coset scheme 501) is identical to the full-length code. While the repair bandwidth of the full-length code grows with l, our schemes in one coset (exemplary one coset scheme 501) and two cosets (exemplary two cosets scheme 502) have a constant normalized bandwidth, and our schemes in multiple cosets (exemplary multiple cosets schemes 503 and 504) have a normalized bandwidth that decreases with l. For small l, the schemes in one coset and two cosets are better than those in multiple cosets: when n=12, k=10, $4 \le \ell \le 48$, the scheme in two cosets provides the lowest bandwidth; when n=12, k=8, $4 \le \ell \le 768$, one can show that the scheme in one coset has the smallest bandwidth. For large l, the first realization in multiple cosets has better performance than the second realization in multiple cosets, but our second realization works for any number of helpers.
IV. Reed-Solomon Repair Schemes for Multiple Erasures
[0163] According to one or more embodiments, definitions of repair schemes for multiple erasures in an MDS code can include a linear repair scheme definition and a dual code repair definition. We prove
the equivalence of the two definitions. Then, we present two
schemes for repairing multiple erasures in Reed-Solomon codes,
where the evaluation points are in one coset and multiple cosets,
respectively. The manner in which bandwidth is affected in multiple
era
A. Definitions of the Multiple-Erasure Repair
[0164] Let us assume a scalar MDS code $\mathcal{E}$ over $F=GF(q^\ell)$ has dimension k and code length n. Let a codeword be $(C_1, C_2, \ldots, C_n)$. Without loss of generality, we assume $\{C_1, C_2, \ldots, C_e\}$ are failed, $e \le n-k$, and we repair them in the base field B=GF(q), where q can be any power of a prime number. We also assume that we use all the remaining d=n-e nodes as helpers. The following definitions are associated with single erasure.
[0165] Definition 1. A linear exact repair scheme for multiple erasures consists of the following.
[0166] 1) A set of queries $Q_t$ for each helper $C_t$, $e+1 \le t \le n$. The helper $C_t$ replies with $\{\operatorname{tr}_{F/B}(\gamma C_t), \gamma \in Q_t\}$.
[0167] 2) For each failed node $C_i$, $i \in [e]$, a linear repair scheme that computes
$$C_i = \sum_{m=1}^{\ell} \lambda_{im}\mu_{im}, \tag{67}$$
[0168] where $\{\mu_{i1}, \mu_{i2}, \ldots, \mu_{i\ell}\}$ is a basis for F over B and the coefficients $\lambda_{im} \in B$ are B-linear combinations of the replies
$$\lambda_{im} = \sum_{t=e+1}^{n}\sum_{\gamma \in Q_t} \beta_{im\gamma t}\operatorname{tr}_{F/B}(\gamma C_t), \tag{68}$$
[0169] with the coefficients $\beta_{im\gamma t} \in B$. The repair bandwidth is
$$b = \sum_{t=e+1}^{n} \operatorname{rank}_B(Q_t). \tag{69}$$
[0170] In the following definition, we consider $e\ell$ dual codewords of $\mathcal{E}$, and index them by $i \in [e]$, $j \in [\ell]$, denoted as $(C'_{ij1}, C'_{ij2}, \ldots, C'_{ijn})$. Since they are dual codewords, $\sum_{t=1}^{n} C_t C'_{ijt}=0$.
[0171] Definition 2. A dual code scheme uses a set of dual codewords $\{(C'_{ij1}, C'_{ij2}, \ldots, C'_{ijn}) : i \in [e], j \in [\ell]\}$ that satisfies:
[0172] 1) The full rank condition: Vectors
$$V_{ij} = (C'_{ij1}, C'_{ij2}, \ldots, C'_{ije}), \quad i \in [e],\ j \in [\ell], \tag{70}$$
[0173] are linearly independent over B.
[0174] 2) The repair bandwidth condition:
$$b = \sum_{t=e+1}^{n} \operatorname{rank}_B(\{C'_{ijt} : i \in [e], j \in [\ell]\}). \tag{71}$$
[0175] We repair nodes [e] from the linearly independent equations
$$\sum_{\upsilon=1}^{e} \operatorname{tr}_{F/B}(C'_{ij\upsilon} C_\upsilon) = -\sum_{t=e+1}^{n} \operatorname{tr}_{F/B}(C'_{ijt} C_t), \quad i \in [e],\ j \in [\ell]. \tag{72}$$
Here we use the same condition names as in the single erasure case, but in this section, they are defined for multiple erasures.
Theorem 6. Definitions 1 and 2 are equivalent.
[0176] The equivalence of Definitions 1 and 2 follows similarly,
except that we need to first solve e failed nodes simultaneously
and then find out the form of each individual failure (67).
[0177] Remark 2. In this paper, we focus on repairing RS codes and apply Theorem 6 to RS codes. It is known that for a polynomial $p_{ij}(x) \in F[x]$ whose degree is smaller than n-k, $(\nu_1 p_{ij}(\alpha_1), \nu_2 p_{ij}(\alpha_2), \ldots, \nu_n p_{ij}(\alpha_n))$ is a dual codeword of RS(n, k), where $\nu_i$, $i \in [n]$, are non-zero constants determined by the evaluation points set A. So, for RS codes, Definition 2 reduces to finding polynomials $p_{ij}(x)$ with degrees smaller than n-k. In what follows we use $p_{ij}(\alpha_t)$ to replace the dual codeword symbol $C'_{ijt}$ in Definition 2 for RS codes. One can easily show that the constants $\nu_i$, $i \in [n]$, do not affect the ranks in the full rank condition and the repair bandwidth condition.
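The dual-codeword property used in Remark 2 can be illustrated numerically. The sketch below works over the prime field GF(13) purely for simplicity (the source works over GF(q^l)), and it assumes the standard choice of multipliers nu_i equal to the inverse of the product of (alpha_i - alpha_j) over j != i; all function names are hypothetical.

```python
P = 13  # assumption: a small prime field GF(13) stands in for F

def poly_eval(coeffs, x):
    """Evaluate a polynomial (coefficients listed low degree first) at x mod P."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % P
    return result

def dual_multipliers(points):
    """nu_i = 1 / prod_{j != i} (alpha_i - alpha_j) mod P (assumed standard form)."""
    nus = []
    for i, a_i in enumerate(points):
        denom = 1
        for j, a_j in enumerate(points):
            if j != i:
                denom = denom * (a_i - a_j) % P
        nus.append(pow(denom, -1, P))
    return nus

def inner_product(points, f, p):
    """sum_i nu_i p(alpha_i) f(alpha_i) mod P; zero when deg f < k and deg p < n - k."""
    nus = dual_multipliers(points)
    return sum(nu * poly_eval(p, x) * poly_eval(f, x)
               for nu, x in zip(nus, points)) % P

points = [1, 2, 3, 4, 5, 6, 7]       # n = 7 evaluation points
message_poly = [3, 1, 4, 1]          # deg 3 < k = 4
repair_poly = [2, 7, 1]              # deg 2 < n - k = 3
check = inner_product(points, message_poly, repair_poly)
```

The inner product vanishes as long as the product of the two polynomials has degree at most n-2; raising the repair polynomial to degree n-k breaks the property, which is why the degree restriction appears in the remark.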
B. Multiple-Erasure Repair in One Coset
[0178] A scheme is provided for multiple erasures in one coset. From Theorem 6, we know that finding the repair scheme for multiple erasures in an RS code is equivalent to finding dual codewords (or polynomials) that satisfy the full rank condition and the repair bandwidth condition. Given a basis $\{\xi_1, \xi_2, \ldots, \xi_\ell\}$ for F over B, we define some matrices as below. They are used to help us check the two rank conditions according to Lemmas 4 and 5. Let the evaluation points of an RS code over F be $A=\{\alpha_1, \ldots, \alpha_n\}$. Let $p_{ij}(x)$, $i \in [e]$, $j \in [\ell]$, be polynomials over F, and B a subfield of F. Define
$$S_{it} = \begin{bmatrix} \operatorname{tr}_{\mathbb{F}/\mathbb{B}}(\xi_1 p_{i1}(\alpha_t)) & \cdots & \operatorname{tr}_{\mathbb{F}/\mathbb{B}}(\xi_\ell p_{i1}(\alpha_t)) \\ \operatorname{tr}_{\mathbb{F}/\mathbb{B}}(\xi_1 p_{i2}(\alpha_t)) & \cdots & \operatorname{tr}_{\mathbb{F}/\mathbb{B}}(\xi_\ell p_{i2}(\alpha_t)) \\ \vdots & \ddots & \vdots \\ \operatorname{tr}_{\mathbb{F}/\mathbb{B}}(\xi_1 p_{i\ell}(\alpha_t)) & \cdots & \operatorname{tr}_{\mathbb{F}/\mathbb{B}}(\xi_\ell p_{i\ell}(\alpha_t)) \end{bmatrix}, \tag{73}$$
$$S \triangleq \begin{bmatrix} S_{11} & S_{12} & \cdots & S_{1e} \\ S_{21} & S_{22} & \cdots & S_{2e} \\ \vdots & \vdots & \ddots & \vdots \\ S_{e1} & S_{e2} & \cdots & S_{ee} \end{bmatrix}. \tag{74}$$
Lemma 4. The following two statements are equivalent:
[0179] 1) The vectors $V_{ij}=(p_{ij}(\alpha_1), p_{ij}(\alpha_2), \ldots, p_{ij}(\alpha_e))$, $i \in [e]$, $j \in [\ell]$, are linearly independent over $\mathbb{B}$.
[0180] 2) The matrix S in (74) has full rank.
Lemma 5. For $t \in [n]$, consider $S_{it}$ in (73). Then
$$\operatorname{rank}\left(\begin{bmatrix} S_{1t} \\ S_{2t} \\ \vdots \\ S_{et} \end{bmatrix}\right) = \operatorname{rank}_{\mathbb{B}}\left(\{p_{ij}(\alpha_t) : i \in [e],\ j \in [\ell]\}\right). \tag{75}$$
[0181] Theorem 7. Let q be a prime number. There exists an RS(n,k) code over $\mathbb{F}=GF(q^\ell)$ with $n<q^a$, $q^s \leq r$ and $a \mid \ell$ such that the repair bandwidth for e erasures is
$$b \leq \frac{e\ell}{a}(n-e)(a-s),$$
measured in symbols over $\mathbb{B}$, for e satisfying
$$a \geq \frac{e(e-1)}{2}(a-s)^2.$$
[0182] Proof: We define a code over the field $\mathbb{F}=GF(q^\ell)$, an extension of $\mathbb{E}=GF(q^a)$, where $\beta$ is the primitive element of $\mathbb{F}$. The evaluation points are chosen to be $A=\{\alpha_1, \alpha_2, \ldots, \alpha_n\} \subseteq \mathbb{E}^*$, which is one of the cosets in Lemma 1. Without loss of generality we assume the e failed nodes are $\{\alpha_1, \alpha_2, \ldots, \alpha_e\}$. The base field is $\mathbb{B}=GF(q)$. Construction III: We first consider the special case when $s=a-1$. In this case, inspired by [19, Proposition 1], we choose the polynomials
$$p_{ij}(x) = \frac{\delta_i \operatorname{tr}_{\mathbb{E}/\mathbb{B}}\left(\frac{\mu_j}{\delta_i}(x-\alpha_i)\right)}{x-\alpha_i}, \quad i \in [e],\ j \in [a], \tag{76}$$
[0183] where $\{\mu_1, \mu_2, \ldots, \mu_a\}$ is the basis for $\mathbb{E}$ over $\mathbb{B}$, and $\delta_i \in \mathbb{E}$, $i \in [e]$, are coefficients to be determined. From [19, Theorem 3], we know that for
$$a > \frac{e(e-1)}{2},$$
there exist $\delta_i$, $i \in [e]$, such that $p_{ij}(x)$ satisfy the full rank condition: the vectors $V_{ij}=(p_{ij}(\alpha_1), p_{ij}(\alpha_2), \ldots, p_{ij}(\alpha_e))$, $i \in [e]$, $j \in [a]$, are linearly independent over $\mathbb{B}$, and the repair bandwidth condition:
$$\sum_{t=e+1}^{n} \operatorname{rank}_{\mathbb{B}}\left(\{p_{ij}(\alpha_t) : i \in [e],\ j \in [a]\}\right) = (n-e)e - \frac{e(e-1)(q-1)}{2}. \tag{77}$$
[0184] Then, let $\{\eta_1, \eta_2, \ldots, \eta_{\ell/a}\}$ be a basis for $\mathbb{F}$ over $\mathbb{E}$. We have the $e\ell$ polynomials $\{\eta_w p_{ij}(x) : w \in [\ell/a],\ i \in [e],\ j \in [a]\}$. Since $\{\eta_1, \eta_2, \ldots, \eta_{\ell/a}\}$ are linearly independent over $\mathbb{E}$ and, for any $b_{ijw} \in \mathbb{B}$, $b_{ijw} p_{ij}(x) \in \mathbb{E}$, we have
$$\sum_{i,j,w} b_{ijw} \eta_w V_{ij} = 0 \iff \sum_{i,j} b_{ijw} V_{ij} = 0, \quad \forall w \in [\ell/a]. \tag{78}$$
[0185] Also, we know that there do not exist nonzero $b_{ijw} \in \mathbb{B}$ that satisfy $\sum_{i,j} b_{ijw} V_{ij} = 0$, so the vectors $\{\eta_w V_{ij} : w \in [\ell/a],\ i \in [e],\ j \in [a]\}$ are also linearly independent over $\mathbb{B}$. So, from Definition 2, we know that we can recover the failed nodes and the repair bandwidth is
$$b = \sum_{t=e+1}^{n} \operatorname{rank}_{\mathbb{B}}\left(\{\eta_1 p_{ij}(\alpha_t), \ldots, \eta_{\ell/a} p_{ij}(\alpha_t) : i \in [e],\ j \in [a]\}\right) = \frac{\ell}{a} \sum_{t=e+1}^{n} \operatorname{rank}_{\mathbb{B}}\left(\{p_{ij}(\alpha_t) : i \in [e],\ j \in [a]\}\right) = \frac{\ell}{a}\left[(n-e)e - \frac{e(e-1)(q-1)}{2}\right]. \tag{79}$$
[0186] Construction IV: For $s \leq a-1$, consider the polynomials
$$p_{ij}(x) = \delta_i^{q^s-1} \mu_j \prod_{\upsilon=1}^{q^s-1}\left(x - \left(\alpha_i - w_\upsilon^{-1}\frac{\mu_j}{\delta_i}\right)\right), \quad j \in [a], \tag{80}$$
[0187] where $\{\mu_1, \mu_2, \ldots, \mu_a\}$ is the basis for $\mathbb{E}$ over $\mathbb{B}$, $W=\{w_0=0, w_1, w_2, \ldots, w_{q^s-1}\}$ is an s-dimensional subspace in $\mathbb{E}$, $s<a$, $q^s \leq r$, and $\delta_i \in \mathbb{E}$, $i \in [e]$, are coefficients to be determined.
[0188] When $x=\alpha_i$, we have
$$p_{ij}(\alpha_i) = \mu_j^{q^s} \prod_{\upsilon=1}^{q^s-1} w_\upsilon^{-1}. \tag{81}$$
[0189] Since $\prod_{\upsilon=1}^{q^s-1} w_\upsilon^{-1}$ is a constant, from Lemma 2 we have
$$\operatorname{rank}_{\mathbb{B}}\left(\{p_{i1}(\alpha_i), p_{i2}(\alpha_i), \ldots, p_{ia}(\alpha_i)\}\right) = a. \tag{82}$$
[0190] For $x \neq \alpha_i$, set $x'=\alpha_i-x$; we have
$$p_{ij}(x) = \delta_i^{q^s-1} \mu_j \prod_{\upsilon=1}^{q^s-1}\left(w_\upsilon^{-1}\frac{\mu_j}{\delta_i} - x'\right) = \delta_i^{q^s-1} \mu_j \prod_{\upsilon=1}^{q^s-1}\left(w_\upsilon^{-1} x'\right) \prod_{\upsilon=1}^{q^s-1}\left(\frac{\mu_j}{\delta_i x'} - w_\upsilon\right) = (\delta_i x')^{q^s} \prod_{\upsilon=1}^{q^s-1}\left(w_\upsilon^{-1}\right) \prod_{\upsilon=0}^{q^s-1}\left(\frac{\mu_j}{\delta_i x'} - w_\upsilon\right). \tag{83}$$
[0192] As $g(y) = \prod_{\upsilon=0}^{q^s-1}(y - w_\upsilon)$ is a linear mapping from $\mathbb{E}$ to itself with dimension $a-s$ over $\mathbb{B}$, and since $(\delta_i x')^{q^s} \prod_{\upsilon=1}^{q^s-1}\left(w_\upsilon^{-1}\right)$ is a constant independent of j, we have
$$\operatorname{rank}_{\mathbb{B}}\left(\{p_{i1}(x), p_{i2}(x), \ldots, p_{ia}(x)\}\right) \leq a-s, \tag{84}$$
[0193] which means that $p_{ij}(x)$ can be written as
$$p_{ij}(x) = \delta_i^{q^s} \sum_{\upsilon=1}^{a-s} \rho_{j\upsilon} \lambda_\upsilon, \tag{85}$$
where $\{\lambda_1, \lambda_2, \ldots, \lambda_{a-s}\}$ are linearly independent over $\mathbb{B}$, $\rho_{j\upsilon} \in \mathbb{B}$, and they are determined by $\delta_i$, $\mu_j$ and $x-\alpha_i$.
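The rank bound behind this step rests on the fact that the product of (y - w) over a subspace W is an additive map whose image has dimension a - s. The sketch below checks this numerically in a toy setting, assuming E = GF(2^4) built with x^4 + x + 1, B = GF(2), and W = span{1, x} (so a = 4, s = 2); these parameters and the helper names are illustrative assumptions, not values from the embodiments.

```python
# Illustrative sketch: subspace polynomial g(y) = prod_{w in W} (y - w) over GF(2^4).

def gf16_mul(x, y, poly=0b10011):
    """Multiply two elements of GF(2^4) represented as 4-bit integers."""
    r = 0
    while y:
        if y & 1:
            r ^= x
        y >>= 1
        x <<= 1
        if x & 0b10000:
            x ^= poly
    return r

# W = {0, 1, x, x + 1}: a 2-dimensional subspace over GF(2) (closed under XOR).
W = [0, 1, 2, 3]

def g(y):
    """Evaluate prod_{w in W} (y - w); subtraction is XOR in characteristic 2."""
    out = 1
    for w in W:
        out = gf16_mul(out, y ^ w)
    return out

# Additivity: g(x + y) = g(x) + g(y) for all x, y in the field.
is_additive = all(g(x ^ y) == g(x) ^ g(y) for x in range(16) for y in range(16))

# The image has 2^(a - s) = 4 elements, i.e. dimension a - s = 2 over GF(2).
image = {g(y) for y in range(16)}
```

Because the kernel of g is exactly W, the a values g takes on any basis span at most a - s dimensions, which is the content of the bound in (84).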
[0194] From Lemma 4, we know that if the matrix S in (74) has full rank, then we can recover the e erasures. It is difficult to directly discuss the rank of the matrix, but assume that the polynomials above satisfy the following two conditions:
[0195] 1) $S_{ii}$, $i \in [e]$, are identity matrices.
[0196] 2) For any fixed $i \in [e]$,
$$S_{it}S_{iy} = 0, \quad i>y>t. \tag{86}$$
[0197] Then, it is easy to see that through Gaussian elimination, we can transform the matrix $S^T$ to an upper triangular block matrix, which has identity matrices on the diagonal. Hence, S has full rank. Here, we select $\{\xi_1, \xi_2, \ldots, \xi_a\}$ to be the dual basis of
$$\left\{\mu_1^{q^s}\prod_{\upsilon=1}^{q^s-1} w_\upsilon^{-1},\ \mu_2^{q^s}\prod_{\upsilon=1}^{q^s-1} w_\upsilon^{-1},\ \ldots,\ \mu_a^{q^s}\prod_{\upsilon=1}^{q^s-1} w_\upsilon^{-1}\right\},$$
so
$$\operatorname{tr}_{\mathbb{E}/\mathbb{B}}\left(\xi_m p_{ij}(\alpha_i)\right) = \begin{cases} 0, & m \neq j, \\ 1, & m = j. \end{cases} \tag{87}$$
[0198] Therefore, $S_{ii}$, $i \in [e]$, are identity matrices. We set $\delta_1=1$, and recursively choose $\delta_i$ after choosing $\{\delta_1, \delta_2, \ldots, \delta_{i-1}\}$ to satisfy (86). Define $\delta'_i=\delta_i^{q^s}$, and $c_{mp}$ to be the (m, p)-th element in $S_{ty}$ for $m, p \in [a]$. Formula (86) can be written as
$$\sum_{m=1}^{a} c_{mp} \operatorname{tr}_{\mathbb{E}/\mathbb{B}}\left(\xi_m p_{ij}(\alpha_t)\right) = \sum_{m=1}^{a} c_{mp} \sum_{\upsilon=1}^{a-s} b_{j\upsilon} \operatorname{tr}_{\mathbb{E}/\mathbb{B}}\left(\xi_m \delta'_i \lambda_\upsilon\right) = 0, \quad \forall j \in [a], \tag{88}$$
##EQU00132## [0199] where .lamda..sub..nu., .nu..di-elect
cons.[a-s] are determined by
[0199] m = 1 a c mp tr / ( .xi. m p ij ( .alpha. t ) ) = m = 1 a c
mp .upsilon. = 1 a - s b j .upsilon. tr / ( .xi. m .delta. i '
.lamda. .upsilon. ) = 0 , .A-inverted. j .di-elect cons. [ a ] , .
( 88 ) ##EQU00133##
[0200] Equation (88) is satisfied if
$$\sum_{m=1}^{a} c_{mp} \operatorname{tr}_{\mathbb{E}/\mathbb{B}}\left(\xi_m \delta'_i \lambda_\upsilon\right) = 0, \quad \upsilon \in [a-s],\ p \in [a]. \tag{89}$$
[0201] As a special case of Lemma 5, we have
$$\operatorname{rank}(S_{ty}) = \operatorname{rank}_{\mathbb{B}}\left(\{p_{tj}(\alpha_y) : j \in [\ell]\}\right). \tag{90}$$
[0202] Then, from (84) we know that the rank of $S_{ty}$ is at most $a-s$, which means that in (89) we only need to consider the p corresponding to the $a-s$ independent columns of $S_{ty}$. So, (89) is equivalent to $(a-s)^2$ linear requirements. For $\delta'_i \in \mathbb{E}$, we can view it as a unknowns over $\mathbb{B}$, and we have
$$\frac{(2e-i)(i-1)}{2}(a-s)^2 \leq \frac{e(e-1)}{2}(a-s)^2 \tag{91}$$
linear requirements over $\mathbb{B}$ according to (86). Also, knowing $\delta'_i$, we can solve $\delta_i=\delta_i^{q^\ell}=\delta_i'^{\,q^{\ell-s}}$. Therefore, we can find appropriate $\{\delta_1, \delta_2, \ldots, \delta_e\}$ to make the matrix S full rank when
$$a \geq \frac{e(e-1)}{2}(a-s)^2. \tag{92}$$
[0203] Then, let $\{\eta_1, \eta_2, \ldots, \eta_{\ell/a}\}$ be a basis for $\mathbb{F}$ over $\mathbb{E}$. We have the $e\ell$ polynomials $\{\eta_w p_{ij}(x) : w \in [\ell/a],\ i \in [e],\ j \in [a]\}$. Similar to Construction III, we know that the vectors $\{\eta_w V_{ij} : w \in [\ell/a],\ i \in [e],\ j \in [a]\}$ are linearly independent over $\mathbb{B}$. Therefore, we can recover the failed nodes and the repair bandwidth is
$$b = \sum_{t=e+1}^{n} \operatorname{rank}_{\mathbb{B}}\left(\{\eta_1 p_{ij}(\alpha_t), \ldots, \eta_{\ell/a} p_{ij}(\alpha_t) : i \in [e],\ j \in [a]\}\right) = \frac{\ell}{a} \sum_{t=e+1}^{n} \operatorname{rank}_{\mathbb{B}}\left(\{p_{ij}(\alpha_t) : i \in [e],\ j \in [a]\}\right) \leq \frac{e\ell}{a}(n-e)(a-s). \tag{93}$$
[0204] In our scheme, we have constructions for arbitrary a, s such that $a \mid \ell$, $s \leq a-1$, while the existing schemes mainly considered the special case $\ell=a$. It should be noted that the scheme in [19] can also be used in the case of $s=a-1$ over $\mathbb{E}$ with repair bandwidth
$$(n-e)e - \frac{e(e-1)(q-1)}{2}.$$
With $\ell/a$ copies of the code, it can also reach the same repair bandwidth as our scheme. However, the resulting code is a vector code, whereas our scheme constructs a scalar code.
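The parameter constraints and the bandwidth bound of the one-coset scheme can be checked numerically. The sketch below assumes the bound reads b <= (e*l/a)(n-e)(a-s) with the conditions n < q^a, q^s <= r, a | l and a >= e(e-1)(a-s)^2/2, and it assumes r denotes n - k; the function names and the sample parameters are hypothetical.

```python
from fractions import Fraction

def one_coset_conditions(n, k, e, q, ell, a, s):
    """Check the assumed parameter constraints (r is taken to be n - k)."""
    return (
        n < q ** a
        and q ** s <= n - k
        and ell % a == 0
        and 2 * a >= e * (e - 1) * (a - s) ** 2  # a >= e(e-1)/2 * (a-s)^2
    )

def one_coset_bound(n, e, ell, a, s):
    """Upper bound on the repair bandwidth, in symbols over the base field."""
    return Fraction(e * ell, a) * (n - e) * (a - s)

# Hypothetical example: n = 14, k = 6, e = 2 erasures, q = 2, l = 8, a = 4, s = 3.
ok = one_coset_conditions(14, 6, 2, 2, 8, 4, 3)
bound = one_coset_bound(14, 2, 8, 4, 3)
```

For comparison, downloading the full content of the n - e = 12 surviving nodes in this example would cost 12 * 8 = 96 symbols over the base field.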
C. Multiple-Erasure Repair in Multiple Cosets
[0205] Recall that the scheme in Theorem 5 for a single erasure is
a small sub-packetization code with small repair bandwidth for any
number of helpers. When there are e erasures and d helpers,
e.ltoreq.n-k, k.ltoreq.d.ltoreq.n-e, we can recover the erasures one
by one using the d helpers. However, the repaired nodes can be
viewed as additional helpers and thus we can reduce the total
repair bandwidth. Finally, for every helper, the transmitted
information for different failed nodes has some overlap, resulting
in a further bandwidth reduction.
[0206] FIG. 6 shows erasure locations in various data packets
according to one or more embodiments. In particular, FIG. 6 depicts
the number of erasures or failures present in various cosets. In
one embodiment, an erasure may be present in each of h.sub.1 cosets
601, h.sub.2 cosets 602, and h.sub.e cosets 603.
[0207] According to one embodiment, using an original code, the
code is extended to a new code with evaluation points as in (44).
If a helper is in the same coset as any failed node, it transmits
naively its entire data; otherwise, it transmits the same amount as
the scheme in the original code. After the extension, the new
construction decreases the sub-packetization size for fixed n, and
the bandwidth is only slightly larger than that of the original code. The
locations of the e erasures are described by h.sub.i, i.di-elect
cons.[e], where 0.ltoreq.h.sub.i.ltoreq.e,
h.sub.1.gtoreq.h.sub.2.gtoreq. . . . .gtoreq.h.sub.e,
.SIGMA..sub.i=1.sup.eh.sub.i=e. We assume the erasures are located
in h.sub.1 cosets 601, and after removing one erasure in each
coset, the remaining erasures are located in h.sub.2 cosets 602.
Then, for the remaining erasures, removing one in each coset, we
get the rest of erasures in h.sub.e cosets, and so on. FIG. 6 also
shows the erasure locations described above.
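The parameters h_1, h_2, ..., h_e can be computed directly from the coset labels of the failed nodes: h_i is the number of cosets that contain at least i erasures. The sketch below is one way to do this; the function name and the integer coset labels are hypothetical.

```python
from collections import Counter

def erasure_profile(coset_of_failed_node):
    """Return [h_1, ..., h_e], where h_i counts cosets holding at least i erasures.

    coset_of_failed_node: one (hypothetical) coset label per failed node.
    """
    e = len(coset_of_failed_node)
    counts = Counter(coset_of_failed_node).values()
    return [sum(1 for c in counts if c >= i) for i in range(1, e + 1)]

# Six erasures: three in coset 0, two in coset 1, one in coset 2.
profile = erasure_profile([0, 0, 0, 1, 1, 2])
```

By construction the profile is non-increasing and sums to e, matching the constraints stated above.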
[0208] In our scheme, we first repair $h_1$ failures, one from each of the $h_1$ cosets. Then, for $2 \leq i \leq e$, we repeat the following: after repairing $h_1, h_2, \ldots, h_{i-1}$ failures, we view these repaired nodes as helpers and repair the next $h_i$ failures, one from each of the $h_i$ cosets. The repair bandwidth of the scheme is shown in the following theorem.
[0209] Theorem 8. Let q be a prime number. There exists an RS(n,k) code over $\mathbb{F}=GF(q^\ell)$ for which
$$\ell = a s q_1 q_2 \cdots q_{\frac{n}{q^a-1}},$$
where $q_i$ is the i-th prime number that satisfies $s \mid (q_i-1)$, $s=(n-k)!$ and a is an integer. For e erasures and d helpers, $e \leq n-k$, $k \leq d \leq n-e$, the average repair bandwidth measured in symbols over $\mathbb{B}$ is
$$b \leq \frac{d\ell}{n-e}\left[\left(h_1(q^a-1) - e\right) + \left(n - h_1(q^a-1)\right) \sum_{i=1}^{e} \frac{h_i}{d-k+\sum_{\upsilon=1}^{i} h_\upsilon}\right], \tag{94}$$
where $h_i$, $i \in [e]$, are the parameters that define the location of erasures in FIG. 6.
[0210] Proof: We first prove the case when a and $\ell'$ are relatively prime using Lemma 3; the case when a and $\ell'$ are not necessarily relatively prime is proved in Appendix A. We use the code in [22] as the original code. Let $\mathbb{F}_q(\alpha)$ be the field obtained by adjoining $\alpha$ to the base field $\mathbb{B}=GF(q)$. Similarly let $\mathbb{F}_q(\alpha_1, \alpha_2, \ldots, \alpha_n)$ be the field for adjoining multiple elements. Let $\alpha_i$ be an element of order $q_i$ over $\mathbb{B}$ and h be the number of erasures in the original code. The original code is defined in the field $\mathbb{F}'=GF(q^{\ell'})=GF(q^{s q_1 q_2 \cdots q_{n'}})$, which is the degree-s extension of $\mathbb{F}_q(\alpha_1, \alpha_2, \ldots, \alpha_{n'})$. The evaluation points are $A'=\{\alpha_1, \alpha_2, \ldots, \alpha_{n'}\}$. The subfield $\mathbb{F}'_{[h]}$ is defined as $\mathbb{F}'_{[h]}=\mathbb{F}_q(\alpha_j,\ j=h+1, h+2, \ldots, n')$, and $\mathbb{F}'_i$ is defined as $\mathbb{F}_q(\alpha_j,\ j \neq i,\ j \in [n'])$.
[0211] In the original code, we assume without loss of generality that there are h failed nodes $f'(\alpha_1), f'(\alpha_2), \ldots, f'(\alpha_h)$. Consider the polynomials for failed node $f'(\alpha_i)$, $1 \leq i \leq h$, as
$$p'_{ij}(x) = x^{j-1} g'_i(x), \quad j \in [s_i],\ x \in \mathbb{F}', \tag{95}$$
[0212] where
$$g'_i(x) = \prod_{\alpha \in A' \setminus (R' \cup \{\alpha_i, \alpha_{i+1}, \ldots, \alpha_h\})} (x-\alpha), \quad x \in \mathbb{F}', \tag{96}$$
[0213] for $R' \subseteq A'$, $|R'|=d'$ being the set of helpers. The set of repair polynomials is $\{\eta_{it} p'_{ij}(x) : i \in [h],\ j \in [s_i],\ t \in [\frac{s q_i}{s_i}]\}$, where $\eta_{it} \in \mathbb{F}'$ are constructed in [22] to ensure that $\{\eta_{it} p'_{i1}(\alpha_i), \eta_{it} p'_{i2}(\alpha_i), \ldots, \eta_{it} p'_{is_i}(\alpha_i)\}$ forms the basis for $\mathbb{F}'$ over $\mathbb{F}'_i$.
[0214] Then, the failed nodes are repaired one by one from
$$\operatorname{tr}_{\mathbb{F}'/\mathbb{F}'_i}\left(\upsilon_{\alpha_i} \eta_{it} p'_{ij}(\alpha_i) f'(\alpha_i)\right) = -\sum_{\epsilon=1,\ \epsilon \neq i}^{n'} \operatorname{tr}_{\mathbb{F}'/\mathbb{F}'_i}\left(\upsilon_{\alpha_\epsilon} \eta_{it} p'_{ij}(\alpha_\epsilon) f'(\alpha_\epsilon)\right). \tag{97}$$
[0215] For $x \notin R' \cup \{\alpha_i, \alpha_{i+1}, \ldots, \alpha_h\}$, $p'_{ij}(x)=0$ and no information is transmitted. Once $f'(\alpha_i)$ is recovered, it is viewed as a new helper for the failures $i+1, i+2, \ldots, h$.
[0216] Since $\mathbb{F}'_{[h]} \leq \mathbb{F}'_i$, the information transmitted from the helper $\alpha_\epsilon$ can be represented as
$$\operatorname{tr}_{\mathbb{F}'/\mathbb{F}'_i}\left(\upsilon_{\alpha_\epsilon} \eta_{it} p'_{ij}(\alpha_\epsilon) f'(\alpha_\epsilon)\right) = \sum_{m=1}^{q'_i} \xi'_{im} \operatorname{tr}_{\mathbb{F}'_i/\mathbb{F}'_{[h]}}\left(\xi_{im} \operatorname{tr}_{\mathbb{F}'/\mathbb{F}'_i}\left(\upsilon_{\alpha_\epsilon} \eta_{it} p'_{ij}(\alpha_\epsilon) f'(\alpha_\epsilon)\right)\right) = \sum_{m=1}^{q'_i} \xi'_{im} \operatorname{tr}_{\mathbb{F}'/\mathbb{F}'_{[h]}}\left(\upsilon_{\alpha_\epsilon} \eta_{it} \xi_{im} p'_{ij}(\alpha_\epsilon) f'(\alpha_\epsilon)\right), \tag{98}$$
[0217] where
$$q'_i = \frac{q_1 q_2 \cdots q_h}{q_i},$$
and $\{\xi_{i1}, \xi_{i2}, \ldots, \xi_{iq'_i}\}$ and $\{\xi'_{i1}, \xi'_{i2}, \ldots, \xi'_{iq'_i}\}$ are the dual bases for $\mathbb{F}'_i$ over $\mathbb{F}'_{[h]}$. We used the fact that $\operatorname{tr}_{\mathbb{F}'_i/\mathbb{F}'_{[h]}}(\operatorname{tr}_{\mathbb{F}'/\mathbb{F}'_i}(\cdot)) = \operatorname{tr}_{\mathbb{F}'/\mathbb{F}'_{[h]}}(\cdot)$, for $\mathbb{F}'_{[h]} \leq \mathbb{F}'_i \leq \mathbb{F}'$.
[0218] The original code satisfies the full rank condition for every $i \in [h]$, and each helper $\alpha_\epsilon$ transmits
$$\operatorname{rank}_{\mathbb{F}'_{[h]}}\left(\left\{\eta_{it} \xi_{im} p'_{ij}(\alpha_\epsilon) : i \in [h],\ j \in [s_i],\ t \in [\tfrac{s q_i}{s_i}],\ m \in [q'_i]\right\}\right) = \operatorname{rank}_{\mathbb{F}'_{[h]}}\left(\left\{\eta_{it} \xi_{im} : i \in [h],\ t \in [\tfrac{s q_i}{s_i}],\ m \in [q'_i]\right\}\right) = \frac{h\ell'}{(d'-k'+h)\prod_{\upsilon=h+1}^{n'} p_\upsilon} \tag{99}$$
[0219] symbols over $\mathbb{F}'_{[h]}$, which achieves the MSR bound.
[0220] In our new code, we extend the field $\mathbb{F}'$ to $\mathbb{F}=GF(q^\ell)$, $\ell=a\ell'$, by adjoining an order-a element $\alpha_{n+1}$ to $\mathbb{F}'$. We set $d-k=d'-k'$. The new evaluation points consist of $A=\{\alpha_1\mathbb{E}^*, \alpha_2\mathbb{E}^*, \ldots, \alpha_{n'}\mathbb{E}^*\}$, $\mathbb{E}=GF(q^a)=\mathbb{F}_q(\alpha_{n+1})$. The subfield $\mathbb{F}_{[h]}$ is defined by adjoining $\alpha_{n+1}$ to $\mathbb{F}'_{[h]}$, and $\mathbb{F}_i$ is defined as $\mathbb{F}_q(\alpha_j,\ j \neq i,\ j \in [n+1])$.
[0221] Assume first that each coset contains at most one failure, and there are h failures in total. We assume without loss of generality that the evaluation points of the h failed nodes are in $\{\alpha_1\mathbb{E}^*, \alpha_2\mathbb{E}^*, \ldots, \alpha_h\mathbb{E}^*\}$ and they are $\alpha_1\gamma_1, \alpha_2\gamma_2, \ldots, \alpha_h\gamma_h$ for some $\gamma_w \in \mathbb{E}$, $w \in [h]$. Let the set of helpers be $R \subseteq A$, $|R|=d$. We define the polynomials as
$$p_{ij}(x) = x^{j-1} g_i(x), \quad j \in [s_i],\ x \in \mathbb{F}, \tag{100}$$
[0222] where
$$g_i(x) = \prod_{\alpha \in A \setminus (R \cup \{\alpha_i\gamma_i, \alpha_{i+1}\gamma_{i+1}, \ldots, \alpha_h\gamma_h\})} (x-\alpha), \quad x \in \mathbb{F}. \tag{101}$$
[0223] The set of repair polynomials is $\{\eta_{it} p_{ij}(x) : i \in [h],\ j \in [s_i],\ t \in [\frac{s q_i}{s_i}]\}$, where $\eta_{it} \in \mathbb{F}'$ are the same as in the original construction. We use the field $\mathbb{F}_i$ as the base field for the repair:
$$\operatorname{tr}_{\mathbb{F}/\mathbb{F}_i}\left(\upsilon_{\alpha_i\gamma_i} \eta_{it} p_{ij}(\alpha_i\gamma_i) f(\alpha_i\gamma_i)\right) = -\sum_{\alpha \in A,\ \alpha \neq \alpha_i\gamma_i} \operatorname{tr}_{\mathbb{F}/\mathbb{F}_i}\left(\upsilon_\alpha \eta_{it} p_{ij}(\alpha) f(\alpha)\right). \tag{102}$$
[0224] If $x \notin R \cup \{\alpha_i\gamma_i, \alpha_{i+1}\gamma_{i+1}, \ldots, \alpha_h\gamma_h\}$, $p_{ij}(x)=0$ and no information is transmitted. Next, we consider all other nodes. If $x=\alpha_i\gamma$ for some $\gamma \in \mathbb{E}$, we have
$$p_{ij}(x) = \gamma^{j-1} \alpha_i^{j-1} g_i(x). \tag{103}$$
[0225] Since $\eta_{it}, \alpha_i \in \mathbb{F}'$ and $g_i(x)$ is a constant independent of j, we have
$$\operatorname{rank}_{\mathbb{F}_i}\left(\left\{\eta_{it} p_{i1}(x), \ldots, \eta_{it} p_{is_i}(x) : t \in [\tfrac{s q_i}{s_i}]\right\}\right) = \operatorname{rank}_{\mathbb{F}_i}\left(\left\{\eta_{it}, \ldots, \eta_{it}\alpha_i^{s_i-1} : t \in [\tfrac{s q_i}{s_i}]\right\}\right) = \operatorname{rank}_{\mathbb{F}'_i}\left(\left\{\eta_{it} p'_{i1}(\alpha_i), \ldots, \eta_{it} p'_{is_i}(\alpha_i) : t \in [\tfrac{s q_i}{s_i}]\right\}\right), \tag{104}$$
which indicates the full rank. Note that the last equation follows from Lemma 3. As a result, we can recover the failed nodes, and each helper in the cosets containing the failed nodes transmits $\ell$ symbols in $\mathbb{B}$.
[0226] For $x=\alpha_\epsilon\gamma$, $\epsilon>h$, since $\mathbb{F}_{[h]}$ is a subfield of $\mathbb{F}_i$ and from Lemma 3 we know that $\{\xi_{i1}, \xi_{i2}, \ldots, \xi_{iq'_i}\}$ and $\{\xi'_{i1}, \xi'_{i2}, \ldots, \xi'_{iq'_i}\}$ are also the dual bases for $\mathbb{F}_i$ over $\mathbb{F}_{[h]}$, then similar to (98) we have
$$\operatorname{tr}_{\mathbb{F}/\mathbb{F}_i}\left(\upsilon_x \eta_{it} p_{ij}(x) f(x)\right) = \sum_{m=1}^{q'_i} \xi'_{im} \operatorname{tr}_{\mathbb{F}/\mathbb{F}_{[h]}}\left(\upsilon_x \eta_{it} \xi_{im} p_{ij}(x) f(x)\right). \tag{105}$$
[0227] Using the fact that $g_i(x)$ is a constant independent of j, $x \in \mathbb{F}_{[h]}$ and $\eta_{it}\xi_{im} \in \mathbb{F}'$, from Lemma 3 we know that
$$\operatorname{rank}_{\mathbb{F}_{[h]}}\left(\left\{\eta_{it}\xi_{im} p_{ij}(x) : i \in [h],\ j \in [s_i],\ t \in [\tfrac{s q_i}{s_i}],\ m \in [q'_i]\right\}\right) = \operatorname{rank}_{\mathbb{F}_{[h]}}\left(\left\{\eta_{it}\xi_{im} : i \in [h],\ t \in [\tfrac{s q_i}{s_i}],\ m \in [q'_i]\right\}\right) = \operatorname{rank}_{\mathbb{F}'_{[h]}}\left(\left\{\eta_{it}\xi_{im} : i \in [h],\ t \in [\tfrac{s q_i}{s_i}],\ m \in [q'_i]\right\}\right) = \operatorname{rank}_{\mathbb{F}'_{[h]}}\left(\left\{\eta_{it}\xi_{im} p'_{ij}(\alpha_\epsilon) : i \in [h],\ j \in [s_i],\ t \in [\tfrac{s q_i}{s_i}],\ m \in [q'_i]\right\}\right) = \frac{h\ell'}{(d-k+h)\prod_{\upsilon=h+1}^{n'} p_\upsilon}, \tag{106}$$
where the last equality follows from (99) and $d'-k'=d-k$. So, each helper in the other cosets transmits
$$\frac{h\ell}{d-k+h}$$
symbols over $\mathbb{B}$.
[0228] Using the above results, we calculate the repair bandwidth
in two steps.
[0229] Step 1. We first repair $h_1$ failures, one from each of the $h_1$ cosets. From (104), we know that in the $h_1$ cosets containing the failed nodes, we transmit $\ell$ symbols over $\mathbb{B}$. By (106), for each helper in other cosets, we transmit
$$\frac{h_1\ell}{d-k+h_1}$$
symbols over $\mathbb{B}$.
[0230] Step 2. For $2 \leq i \leq e$, repeat the following. After repairing $h_1, h_2, \ldots, h_{i-1}$ failures, these nodes can be viewed as helpers for repairing the next $h_i$ failures, one from each of the $h_i$ cosets. So, we have
$$d + \sum_{\upsilon=1}^{i-1} h_\upsilon$$
helpers for the $h_i$ failures.
[0231] For the helpers in the $h_1$ cosets containing the failed nodes, we already transmit $\ell$ symbols over $\mathbb{B}$ in Step 1 and no more information needs to be transmitted. For each helper in other cosets, we transmit
$$\frac{h_i\ell}{d-k+\sum_{\upsilon=1}^{i} h_\upsilon}$$
symbols over $\mathbb{B}$.
[0232] Thus, we can repair all the failed nodes. The repair
bandwidth can be calculated as (94).
[0233] Suppose that e failures are to be recovered. Compared to the
naive strategy which always uses d helpers to repair the failures
one by one, our scheme gets a smaller repair bandwidth since the
recovered failures are viewed as new helpers and we take advantage
of the overlapped symbols for repairing different failures.
[0234] In the case when $n \gg e(q^a-1)$, or when we arrange nodes with correlated failures in different cosets, we can assume that all the erasures are in different cosets, $h_1=e$, $h_2=h_3=\cdots=h_e=0$. For example, if correlated failures tend to appear in the same rack in a data center, we can assign each node in the rack to a different coset. Under such conditions, we simplify the repair bandwidth as
$$b \leq \frac{d}{n-e} \cdot \frac{e\ell}{d-k+e}\left(n-e+(d-k)(q^a-2)\right). \tag{107}$$
[0235] Indeed, one can examine the expression of (94). With the constraint that $\sum_{i=1}^{e} h_i=e$, the first term is an increasing function of $h_1$ and the second term
$$\left(n - h_1(q^a-1)\right) \sum_{i=1}^{e} \frac{h_i}{d-k+\sum_{\upsilon=1}^{i} h_\upsilon}$$
is a decreasing function of $h_1$. Under the assumption that n is large, the second term dominates, and increasing $h_1$ reduces the total repair bandwidth b. Namely, $h_1=e$ corresponds to the lowest bandwidth for large code length.
[0236] In particular, when $d=n-e$, $h_1=e$, we have
$$b = \frac{e\ell}{n-k}(n-e) + \frac{e\ell}{n-k}(n-k-e)(q^a-2), \tag{108}$$
where the second term is the extra repair bandwidth compared with the MSR bound.
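The relationship between the general bound and its two special cases can be checked with exact rational arithmetic. The sketch below assumes (94) carries the prefactor d*l/(n-e), (107) the prefactor e*d*l/((n-e)(d-k+e)), and (108) the prefactor e*l/(n-k); the function names and the sample parameters are hypothetical.

```python
from fractions import Fraction

def bound_94(n, k, d, ell, q, a, h):
    """General multiple-coset bound for an erasure profile h = [h_1, ..., h_e]."""
    e = sum(h)
    coset_size = q ** a - 1
    running, series = 0, Fraction(0)
    for h_i in h:
        running += h_i  # running = h_1 + ... + h_i
        series += Fraction(h_i, d - k + running)
    inner = (h[0] * coset_size - e) + (n - h[0] * coset_size) * series
    return Fraction(d * ell, n - e) * inner

def bound_107(n, k, d, ell, q, a, e):
    """Special case h_1 = e (all erasures in different cosets)."""
    return Fraction(e * d * ell, (n - e) * (d - k + e)) * (n - e + (d - k) * (q ** a - 2))

def bound_108(n, k, ell, q, a, e):
    """Special case h_1 = e and d = n - e."""
    return Fraction(e * ell, n - k) * ((n - e) + (n - k - e) * (q ** a - 2))

# Hypothetical parameters: n = 30, k = 20, e = 2, q = 2, a = 2, d = 25, l = 1.
general = bound_94(30, 20, 25, 1, 2, 2, [2, 0])
simplified = bound_107(30, 20, 25, 1, 2, 2, 2)
```

Setting l = 1 gives the bound per unit of sub-packetization, which is convenient when comparing parameter choices.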
Numerical Evaluations and Discussions
[0237] In this subsection, we compare our schemes for multiple erasures with previous results, including separate repair and existing schemes. Repairing multiple erasures simultaneously can save repair bandwidth compared to repairing erasures separately. Assume that e failures happen one by one, and that the remaining $n-1$ nodes are available as helpers when the first failure occurs. We can either repair each failure separately using $n-1$ helpers, or wait for e failures and repair all of them simultaneously with $n-e$ helpers. Table III shows a comparison. For our scheme in one coset, separate repair needs a repair bandwidth of
$$\frac{e\ell}{a}(n-1)(a-s)$$
symbols in $\mathbb{B}$, while simultaneous repair requires a bandwidth of
$$\frac{e\ell}{a}(n-e)(a-s).$$
For our scheme in multiple cosets, we can repair the failures separately by $n-1$ helpers with the bandwidth of
$$\frac{e\ell}{n-k}\left[n-1+(n-k-1)(q^a-2)\right],$$
and with simultaneous repair we can achieve the bandwidth of
$$\frac{e\ell}{n-k}\left[n-e+(n-k-e)(q^a-2)\right].$$
One can see that in both constructions, simultaneous repair outperforms separate repair.
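The comparison can be reproduced numerically. The sketch below evaluates the four bandwidth expressions, taking the leading factor e*l/a (one coset) and e*l/(n-k) (multiple cosets) as in Table III; the function names and the sample parameters are hypothetical.

```python
from fractions import Fraction

def one_coset_bandwidth(n, e, ell, a, s, simultaneous):
    """One-coset scheme: n - e helpers if simultaneous, else n - 1 per failure."""
    helpers = n - e if simultaneous else n - 1
    return Fraction(e * ell, a) * helpers * (a - s)

def multi_coset_bandwidth(n, k, e, ell, q, a, simultaneous):
    """Multiple-coset scheme with all erasures in different cosets."""
    x = e if simultaneous else 1
    return Fraction(e * ell, n - k) * ((n - x) + (n - k - x) * (q ** a - 2))

# Hypothetical parameters: n = 16, k = 8, e = 2, q = 2.
sep_one = one_coset_bandwidth(16, 2, 8, 4, 3, simultaneous=False)
sim_one = one_coset_bandwidth(16, 2, 8, 4, 3, simultaneous=True)
sep_multi = multi_coset_bandwidth(16, 8, 2, 8, 2, 2, simultaneous=False)
sim_multi = multi_coset_bandwidth(16, 8, 2, 8, 2, 2, simultaneous=True)
```

In both cases the saving grows with e, because every factor that depends on the number of helpers shrinks from n - 1 to n - e.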
[0238] Next we compare our scheme for multiple erasures with the
existing schemes.
[0239] FIG. 4 shows the normalized repair bandwidth for different
schemes when n=16, k=8, e=2, q=2. Table IV shows the comparison
when n=64, k=32, e=2, q=2. We make the following observations:
[0240] 1) For fixed (n, k) and our scheme with multiple cosets, we use the parameter a to adjust the sub-packetization size. From Theorem 8, we know that
$$\ell \approx a\left(\frac{n}{q^a-1}\right)^{\frac{n}{q^a-1}},$$
which means that increasing a will decrease the sub-packetization $\ell$. In our schemes with one coset and two cosets, the parameter a is determined by the code length n, so increasing $\ell$ will not change a or the normalized repair bandwidth.
[0241] 2) When $\ell$ grows larger ($4 < \ell < 2.1 \times 10^7$ in FIG. 4, $6 < \ell < 3.3 \times 10^{11}$ in Table IV) the embodiments have the smallest repair bandwidth.
[0242] 3) For extremely large $\ell$ (in FIG. 4; $\ell \geq 3.3 \times 10^{11}$ in Table IV) the embodiments in multiple cosets have the smallest repair bandwidths.
[0243] Table III illustrates the repair bandwidth of different schemes for e erasures.
TABLE-US-00003 TABLE III
Scheme | Repair bandwidth | Number of helpers |
Single-erasure repair in one coset (separate repair) | $\frac{e\ell}{a}(n-1)(a-s)$ | n - 1 |
Multiple-erasure repair in one coset (simultaneous repair) | $\frac{e\ell}{a}(n-e)(a-s)$ | n - e |
Single-erasure repair in multiple cosets (separate repair) | $\frac{e\ell}{n-k}\left[n-1+(n-k-1)(q^a-2)\right]$ | n - 1 |
Multiple-erasure repair in multiple cosets (simultaneous repair) | $\frac{e\ell}{n-k}\left[n-e+(n-k-e)(q^a-2)\right]$ | n - e |
[0244] Table IV shows the normalized repair bandwidth, $\frac{b}{(n-e)\ell}$, for different schemes when n=64, k=32, e=2, q=2.
TABLE-US-00004 TABLE IV = 6 = 7 = 8 = 9 . . . = 3.6 .times. 10 =
3.3 .times. 10.sup.11 = 3.9 .times. 10 Normalized bandwidth 0.42
0.50 0.52 0.52 . . . 0.52 0.52 0.52 for Scheme 1 in [19] Normalized
bandwidth 0.49 0.49 0.49 0.49 . . . 0.49 0.49 0.49 for our scheme
in one coset Normalized bandwidth 0.52 0.48 0.0625* for our scheme
in multiple cosets indicates data missing or illegible when
filed
[0245] FIG. 7 illustrates a graph representation showing normalized
repair bandwidth values that correspond to various subpacketization
sizes as a result of implementing different repair schemes that are
in accordance with one or more embodiments described herein.
Specifically, FIG. 7 depicts a relationship between various
subpacketization sizes and normalized repair bandwidths upon the
implementations of the following repair schemes: current scheme in
one coset 701, current scheme in multiple cosets 702, MBW Scheme
703. The MBW scheme 703 relates to scheme 2, which is designed by
Mardia, Bartan, and Wootters. FIG. 7 also illustrates a comparison
of the schemes, q=2, n=16, k=8, e=2, x-axis is the log scale
sub-packetization size, y-axis is the normalized repair bandwidth.
Moreover, the normalized repair bandwidth values resulting from
implementation of different schemes to correct single erasures in
one coset and multiple cosets, and multiple erasures in one coset
and multiple cosets, are shown in Table III and Table IV.
[0246] As described above, three Reed-Solomon code repair schemes
are provided that include a tradeoff between the sub-packetization
size and the repair bandwidth. Our schemes (current scheme in one
coset 701 and current scheme in multiple cosets 702) choose the
evaluation points of the Reed-Solomon code from one, two, or
multiple cosets of the multiplicative group of the underlying
finite field. For a single erasure, when the sub-packetization size
is large, the scheme in multiple cosets has better performance and
approaches the MSR bound. When the sub-packetization size is small, the
scheme in one coset has advantages in repair bandwidth. The scheme
in two cosets has smaller repair bandwidth with certain parameters
in between the other two cases. For multiple erasures, our scheme
in one coset has constructions for arbitrary redundancy n-k, and our
scheme in multiple cosets reduces the sub-packetization size of an
MSR code. The two schemes together provide a set of tradeoff
points, and we observe similar tradeoff characteristics as in the
single erasure case. In spite of the several tradeoff points
provided in this paper, the dependence of the sub-packetization
size on the repair bandwidth is still an open question.
[0247] FIG. 8 illustrates a process 800 for error correction of
distributed data based on modifying a Reed-Solomon correction code
according to one or more embodiments described herein.
[0248] According to one embodiment, process 800 may be initiated at
block 801 when a device (e.g., repair device 120 depicted in FIG.
1) receives a data request from one or more client devices (e.g.,
one or more of client devices 115.sub.1-n).
[0249] At block 802, process 800 includes receiving a first
correction code for a plurality of data fragments stored in a
plurality of storage nodes. The first correction code is a
Reed-Solomon code that has a data symbol for each of the plurality
of storage nodes and is represented as a polynomial over a finite
field with a sub-packetization size.
[0250] At block 803, process 800 includes constructing a second
correction code that is constructed in response to at least one
unavailable storage node of the plurality of storage nodes. The
second correction code is represented as a second polynomial over a
second finite field and has an increased subpacketization size
relative to the first polynomial. In one embodiment, the second
correction code is a new Reed-Solomon code. The second correction
code may be constructed using one or more operations discussed
relative to block 210 in process 200 described above. Moreover, it
is noted that the unavailability may be due to an erasure error in
one of the plurality of data fragments. An erasure error may occur
if a storage location of a particular symbol included in one of the
data fragments stored in the storage nodes is unknown.
[0251] At block 804, process 800 includes performing erasure repair
using the second correction code. In one embodiment, the second
correction code is applied to available data fragments of the
plurality of data fragments for erasure repair. Additionally, the
erasure error utilizes at least one coset of a multiplicative group
of the second finite field. In one embodiment, repairing the
erasure includes using a first dual codeword and a second dual
codeword as repair polynomials. In this embodiment, a trace
function is combined with the dual codewords to generate
fragments. In one embodiment, the trace function is used to obtain
subfield symbols, as described in the Preliminaries section below.
Moreover, the use of the trace function in combination with dual
codewords is described in the Reed-Solomon Repair Schemes for
Multiple Erasures section. The erasure repair may be performed
using one or more operations discussed relative to block 215 in
process 200 described above.
[0252] At block 805, process 800 includes outputting a corrected
data fragment based on the erasure repair that is performed in
block 215. In one embodiment, errors in one or more of the
plurality of data fragments are corrected and the data fragments
are restored in their entirety such that all data fragments are
accessible.
Repairing Reed-Solomon (14, 10) in GF(2.sup.8)
[0253] Another embodiment is directed to providing an algorithm to
repair a single node failure in the (14,10) Reed-Solomon code over
GF(2.sup.8) in a system (e.g., Facebook's f4 system). Embodiments
are directed to an algorithm that can repair any failed node and
requires only 4 bits from each remaining node.
Repair Algorithm for RS (14, 10) Code
[0254] Notation: [a] represents {1, 2, . . . , a} for a positive
integer a.
[0255] The RS (14, 10) code has n=14 codeword symbols and k=10
information symbols. In a storage system, different symbols are
stored in different nodes. The symbols are over the finite field
F=GF(2.sup.8)={0,1,.beta.,.beta..sup.2, . . . ,.beta..sup.254}, where
.beta. is a root of 1+x.sup.2+x.sup.3+x.sup.4+x.sup.8=0. For the
given information symbols u.sub.j.di-elect cons.F, j=0,1, . . . ,
9, let f be the polynomial
$$f(x) = \sum_{j=0}^{9} u_j x^j.$$
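By way of a non-limiting illustration, the field arithmetic behind this definition can be sketched in Python. The byte representation (each element stored as a byte whose bits are the coefficients of powers of .beta., so that .beta. is 0x02 and the defining polynomial is 0x11D) and the names `build_tables`, `gf_mul` and `poly_eval` are assumptions made for this sketch and are not taken from the disclosure.

```python
# Sketch only: GF(2^8) with beta a root of 1 + x^2 + x^3 + x^4 + x^8.
# Assumption: elements are bytes, beta is 0x02, and the modulus is 0x11D.
PRIM = 0x11D


def build_tables():
    """Return (exp, log) with exp[i] = beta^i and log[beta^i] = i."""
    exp, log = [0] * 510, [0] * 256
    x = 1
    for i in range(255):
        exp[i], log[x] = x, i
        x <<= 1                  # multiply by beta
        if x & 0x100:            # reduce modulo the defining polynomial
            x ^= PRIM
    for i in range(255, 510):    # duplicate so exp[log a + log b] never wraps
        exp[i] = exp[i - 255]
    return exp, log


EXP, LOG = build_tables()


def gf_mul(a, b):
    """Multiply two field elements; addition in this field is XOR."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def poly_eval(u, x):
    """Evaluate f(x) = sum_j u[j] * x^j by Horner's rule."""
    acc = 0
    for coeff in reversed(u):
        acc = gf_mul(acc, x) ^ coeff
    return acc


if __name__ == "__main__":
    u = list(range(1, 11))             # ten hypothetical information symbols
    print(poly_eval(u, EXP[17]))       # symbol that would be stored at beta^17
```

Log and antilog tables are one common way to implement multiplication in a field of this size; any other correct GF(2.sup.8) implementation with the same defining polynomial would serve equally well.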
[0256] RS codeword symbols are evaluations of the polynomial f and
erasures can be corrected using interpolation. By restricting the
evaluation points to a subfield, one obtains a low repair bandwidth
given the field size. Consider the subfield
E=GF(2.sup.4)={0,1,.beta..sup.17,.beta..sup.17.times.2, . . . ,
.beta..sup.17.times.14} of F. A set of evaluation points is chosen from E
and denoted by A={.alpha..sub.1,.alpha..sub.2, . . . ,
.alpha..sub.14}={1,.beta..sup.17,.beta..sup.17.times.2,.beta..sup.17.times.3, . . .
, .beta..sup.17.times.13}. Then, the 14 codeword symbols, in the 14
nodes, are
$$\{ N_m = f(\alpha_m) = \sum_{j=0}^{9} u_j \alpha_m^j : \alpha_m \in A \}.$$
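As a hedged illustration of the preceding paragraph, the Python sketch below checks that the powers of .beta..sup.17 together with 0 form a subfield, encodes ten symbols into the 14 node symbols, and recovers one symbol by conventional interpolation from ten other nodes. The byte representation (.beta. as 0x02, modulus 0x11D) and all function names are assumptions of the sketch. The interpolation shown is the baseline that reads ten full symbols (80 bits), not the reduced-bandwidth repair.

```python
# GF(2^8) setup (assumed representation: beta = 0x02, modulus 0x11D).
EXP, LOG = [0] * 510, [0] * 256
_x = 1
for _i in range(255):
    EXP[_i], LOG[_x] = _x, _i
    _x = (_x << 1) ^ (0x11D if _x & 0x80 else 0)
EXP[255:] = EXP[:255]


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    return EXP[255 - LOG[a]]


def poly_eval(u, x):
    """Horner evaluation of f(x) = sum_j u[j] * x^j."""
    acc = 0
    for coeff in reversed(u):
        acc = gf_mul(acc, x) ^ coeff
    return acc


SUBFIELD_E = {0} | {EXP[17 * k] for k in range(15)}   # GF(2^4) inside GF(2^8)
POINTS_A = [EXP[17 * k] for k in range(14)]           # alpha_1 .. alpha_14


def is_subfield(elems):
    """Closed under addition (XOR) and multiplication."""
    return all((a ^ b) in elems and gf_mul(a, b) in elems
               for a in elems for b in elems)


def encode(u):
    """Map 10 information symbols to the 14 node symbols N_m = f(alpha_m)."""
    assert len(u) == 10
    return [poly_eval(u, a) for a in POINTS_A]


def interpolate_at(xs, ys, x0):
    """Lagrange interpolation through (xs, ys), evaluated at x0."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = gf_mul(num, x0 ^ xj)    # subtraction is XOR
                den = gf_mul(den, xi ^ xj)
        total ^= gf_mul(yi, gf_mul(num, gf_inv(den)))
    return total
```

For example, `interpolate_at(POINTS_A[4:], encode(u)[4:], POINTS_A[0])` returns the first node symbol of `encode(u)` for any list `u` of ten byte values, because ten evaluations determine a polynomial of degree at most 9.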
[0257] Algorithm steps are described below according to one or more
embodiments. Assume the node N.sub.*=f(.alpha..sub.*) fails, where
.alpha..sub.*.di-elect cons.A. The transmitted symbols from the other
nodes (called helpers) are related to the failed node symbol
through a dual codeword (c.sub.1, c.sub.2, . . . , c.sub.14) of RS(14,10):
$$c_* N_* = \sum_{m=1,\, m \neq *}^{14} c_m N_m. \tag{2}$$
[0258] Table V shows column multipliers .nu..sub.m corresponding to
A={1,.beta..sup.17,.beta..sup.17.times.2, . . . ,.beta..sup.17.times.13}.
TABLE-US-00005 TABLE V
.nu..sub.1 | .nu..sub.2 | .nu..sub.3 | .nu..sub.4 | .nu..sub.5 | .nu..sub.6 | .nu..sub.7 | .nu..sub.8 | .nu..sub.9 | .nu..sub.10 | .nu..sub.11 | .nu..sub.12 | .nu..sub.13 | .nu..sub.14
.beta. | .beta. | 1 | .beta. | .beta..sup.221 | .beta..sup.34 | .beta..sup.238 | .beta..sup.136 | .beta..sup.238 | .beta..sup.221 | .beta..sup.102 | .beta..sup.102 | .beta..sup.34 | 1
(Some entries were missing or illegible when filed; a bare .beta. may be an entry whose exponent was lost.)
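As a non-authoritative sketch, the column multipliers can be computed and the dual-codeword relation checked numerically. The sketch assumes the usual normalization for duals of Reed-Solomon codes, in which .nu..sub.m is the inverse of the product of (.alpha..sub.m-.alpha..sub.j) over all j other than m; a dual codeword is only determined up to a common scalar, so this normalization is an assumption, as are the byte representation (.beta. as 0x02, modulus 0x11D) and the function names.

```python
# GF(2^8) setup (assumed representation: beta = 0x02, modulus 0x11D).
EXP, LOG = [0] * 510, [0] * 256
_x = 1
for _i in range(255):
    EXP[_i], LOG[_x] = _x, _i
    _x = (_x << 1) ^ (0x11D if _x & 0x80 else 0)
EXP[255:] = EXP[:255]


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    return EXP[255 - LOG[a]]


def gf_pow(a, e):
    if e == 0:
        return 1
    return 0 if a == 0 else EXP[(LOG[a] * e) % 255]


def poly_eval(u, x):
    acc = 0
    for coeff in reversed(u):
        acc = gf_mul(acc, x) ^ coeff
    return acc


POINTS_A = [EXP[17 * k] for k in range(14)]


def column_multipliers(points):
    """v_m = 1 / prod_{j != m} (alpha_m - alpha_j); subtraction is XOR."""
    out = []
    for i, a in enumerate(points):
        prod = 1
        for j, b in enumerate(points):
            if j != i:
                prod = gf_mul(prod, a ^ b)
        out.append(gf_inv(prod))
    return out


def dual_check(symbols, v, points, max_degree=3):
    """True if sum_m v_m * alpha_m^i * N_m == 0 for every i in 0..max_degree."""
    for i in range(max_degree + 1):
        acc = 0
        for vm, a, n in zip(v, points, symbols):
            acc ^= gf_mul(vm, gf_mul(gf_pow(a, i), n))
        if acc:
            return False
    return True
```

Under the stated normalization, `column_multipliers(POINTS_A)` agrees with the entries of Table V that carry an explicit value (for example the third and fourteenth entries equal 1), and `dual_check` holds for every codeword because the product of a repair polynomial of degree at most 3 with f has degree at most 12.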
[0259] Table VI shows a dual basis for each node.
TABLE-US-00006 TABLE VI
Failed node | N.sub.1 | N.sub.2 | N.sub.3 | N.sub.4 | N.sub.5 | N.sub.6 | N.sub.7 | N.sub.8 | N.sub.9 | N.sub.10 | N.sub.11 | N.sub.12 | N.sub.13 | N.sub.14
.gamma..sub.1 | .beta..sup.203 | .beta..sup.118 | .beta. | .beta..sup.203 | .beta..sup.33 | .beta..sup.220 | .beta..sup.16 | .beta..sup.118 | .beta..sup.16 | .beta..sup.33 | .beta..sup.152 | .beta..sup.152 | .beta..sup.220 | .beta..sup.254
.gamma..sub.2 | .beta..sup.152 | .beta. | .beta..sup.203 | .beta. | .beta..sup.237 | .beta..sup.169 | .beta. | .beta..sup.67 | .beta..sup.220 | .beta..sup.237 | .beta..sup.101 | .beta..sup.101 | .beta..sup.169 | .beta.
.gamma..sub.3 | .beta. | .beta. | .beta..sup.135 | .beta..sup.83 | .beta..sup.169 | .beta. | .beta..sup.152 | .beta. | .beta. | .beta..sup.169 | .beta..sup.33 | .beta..sup.33 | .beta..sup.101 | .beta.
.gamma..sub.4 | .beta. | .beta. | .beta. | .beta..sup.16 | .beta..sup.101 | .beta..sup.33 | .beta. | .beta. | .beta. | .beta..sup.103 | .beta..sup.220 | .beta..sup.320 | .beta..sup.33 | .beta.
.gamma..sub.5 | .beta..sup.187 | .beta..sup.102 | .beta..sup.238 | .beta. | .beta..sup.17 | .beta..sup.204 | 1 | .beta..sup.102 | 1 | .beta..sup.17 | .beta..sup.136 | .beta..sup.136 | .beta..sup.204 | .beta.
.gamma..sub.6 | .beta..sup.188 | .beta. | .beta..sup.187 | .beta. | .beta..sup.221 | .beta..sup.153 | .beta..sup.204 | .beta. | .beta..sup.204 | .beta. | .beta..sup.88 | .beta..sup.85 | .beta..sup.153 | .beta..sup.187
.gamma..sub.7 | .beta. | .beta..sup.238 | .beta. | .beta. | .beta..sup.183 | .beta..sup.85 | .beta..sup.136 | .beta..sup.238 | .beta..sup.138 | .beta..sup.183 | .beta..sup.17 | .beta..sup.17 | .beta..sup.85 | .beta..sup.119
.gamma..sub.8 | 1 | .beta..sup.170 | .beta. | 1 | .beta. | .beta..sup.17 | .beta. | .beta..sup.170 | .beta..sup.68 | .beta..sup.88 | .beta. | .beta. | .beta..sup.17 | .beta.
(Some entries were missing or illegible when filed; a bare .beta. may be an entry whose exponent was lost.)
[0260] In particular, a dual codeword corresponds to evaluations of
a polynomial over F of degree at most n-k-1=3, each scaled by the
column multiplier .nu..sub.m of its node. Here the evaluations
of 8 appropriate dual codeword polynomials are found in Step 1. In
each helper, at most 4 evaluations are linearly independent,
leading to the bandwidth reduction (Step 2). As in Step 3, we map
symbols of F to sub-symbols of the base field, B=GF(2), using the
trace function tr.sub.F/B(x)=x+x.sup.2+x.sup.4+ . . . +x.sup.128, where for
every x.di-elect cons.F, tr.sub.F/B(x).di-elect cons.B. Taking the trace on both
sides of (2), one obtains 8 trace functions of the failed symbol in
Step 4. Finally, in Step 5, the following property is used to repair
the failed symbol. Let {.gamma..sub.1,.gamma..sub.2, . . . ,
.gamma..sub.8} and {.gamma..sub.1',.gamma..sub.2', . . . ,
.gamma..sub.8'} be dual bases for F over B; then any symbol
x.di-elect cons.F can be reconstructed from its 8 trace functions:
x=.SIGMA..sub.i=1.sup.8.gamma..sub.i'tr.sub.F/B(.gamma..sub.ix).
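The dual-basis property can be illustrated with a short Python sketch. It assumes the byte representation used for illustration here (.beta. as 0x02, modulus 0x11D), and it finds the dual basis by exhaustive search, which is a convenience of the sketch rather than a method stated in the disclosure. The polynomial basis {1, .beta., . . . , .beta..sup.7} in the usage note is likewise only an example basis.

```python
# GF(2^8) setup (assumed representation: beta = 0x02, modulus 0x11D).
EXP, LOG = [0] * 510, [0] * 256
_x = 1
for _i in range(255):
    EXP[_i], LOG[_x] = _x, _i
    _x = (_x << 1) ^ (0x11D if _x & 0x80 else 0)
EXP[255:] = EXP[:255]


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def trace(x):
    """tr(x) = x + x^2 + x^4 + ... + x^128, returned as the integer 0 or 1."""
    t = 0
    for _ in range(8):
        t, x = t ^ x, gf_mul(x, x)
    return t


def trace_bits(x, basis):
    """The 8 bits tr(gamma_i * x) for gamma_i in basis."""
    return [trace(gf_mul(g, x)) for g in basis]


def dual_basis(basis):
    """Exhaustively find gamma'_j with tr(gamma_i * gamma'_j) = 1 iff i == j."""
    dual = []
    for j in range(len(basis)):
        want = [1 if i == j else 0 for i in range(len(basis))]
        dual.append(next(y for y in range(1, 256)
                         if trace_bits(y, basis) == want))
    return dual


def reconstruct(bits, dual):
    """x = sum_i gamma'_i * tr(gamma_i * x); each bit selects one gamma'_i."""
    x = 0
    for bit, g_dual in zip(bits, dual):
        if bit:
            x ^= g_dual
    return x
```

With `basis = [EXP[i] for i in range(8)]` and `dual = dual_basis(basis)`, the call `reconstruct(trace_bits(x, basis), dual)` returns `x` for every byte `x`, which is the property used in Step 5.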
[0261] Repair algorithm for RS (14, 10) over GF(2.sup.8):
[0262] Step 1: For each helper node m such that .alpha..sub.m.di-elect
cons.A\{.alpha..sub.*}, evaluate 8 polynomials at .alpha..sub.m:
{.nu..sub.m.eta..sub.tp.sub.j,*(.alpha..sub.m), t.di-elect cons.[2],
j.di-elect cons.[4]}, where the .nu..sub.m are the column multipliers
provided in Table V,
$$\{\eta_1, \eta_2\} = \{1, \beta\}, \quad p_{j,*}(x) = \xi_j \prod_{w \in W} \left(x - \alpha_* + w^{-1} \xi_j\right), \quad \xi_j = \beta^{17(j-1)}, \quad W = \{1, \beta^{17}, \beta^{68}\}.$$
[0263] Step 2: Polynomial evaluations calculated in Step 1 from the
same helper are in a subspace of F with dimension 4. Let us
represent a basis of this 4-dimensional subspace as {s.sub.m1, s.sub.m2, s.sub.m3,
s.sub.m4}. If the result of Step 1 is 4 unique values, directly
represent them as {s.sub.m1, s.sub.m2, s.sub.m3, s.sub.m4}. If there are more than 4
unique values, there are many different ways to find a basis of the
subspace. One general way is to view the results as vectors in B.sup.8
and write them in an 8.times.8 matrix over B. Then, one can apply
the row reduction method to derive a basis in the first 4 nonzero
rows.
[0264] Step 3: Download the binary values
D.sub.m,j=tr.sub.F/B(s.sub.mjN.sub.m), j.di-elect cons.[4], from node m.
[0265] Step 4: Since the trace function is linear and each value
.nu..sub.m.eta..sub.tp.sub.j,*(.alpha..sub.m) lies in the subspace
spanned by {s.sub.m1, s.sub.m2, s.sub.m3, s.sub.m4}, we use the
information downloaded from the helpers to represent
tr.sub.F/B(.nu..sub.m.eta..sub.tp.sub.j,*(.alpha..sub.m)N.sub.m) and then
calculate tr.sub.F/B(.nu..sub.*.eta..sub.tp.sub.j,*(.alpha..sub.*)N.sub.*),
t.di-elect cons.[2], j.di-elect cons.[4], from
$$\operatorname{tr}_{F/B}\left(\nu_* \eta_t p_{j,*}(\alpha_*) N_*\right) = -\sum_{m=1,\, m \neq *}^{14} \operatorname{tr}_{F/B}\left(\nu_m \eta_t p_{j,*}(\alpha_m) N_m\right), \tag{3}$$
where the minus sign is immaterial because F has characteristic 2.
[0266] Altogether, 8 bits are calculated for the failed node
N.sub.*.
[0267] Step 5:
{.nu..sub.*.eta..sub.tp.sub.j,*(.alpha..sub.*):t.di-elect cons.[2],
j.di-elect cons.[4]} forms a basis for F over B. Let us denote
these 8 values by {.gamma..sub.1,.gamma..sub.2, . . . , .gamma..sub.8}.
Then, we use the 8 bits calculated in Step 4 to reconstruct N.sub.*
by
$$N_* = \sum_{i=1}^{8} \gamma_i' \operatorname{tr}_{F/B}\left(\gamma_i N_*\right), \tag{4}$$
where {.gamma..sub.1',.gamma..sub.2', . . . , .gamma..sub.8'} is
the dual basis of {.gamma..sub.1, .gamma..sub.2, . . . ,
.gamma..sub.8} for F over B, as provided in Table VI.
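Steps 1 through 5 can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: bytes represent field elements with .beta. as 0x02 and modulus 0x11D, the column multipliers use the normalization 1/prod(.alpha..sub.m-.alpha..sub.j), the basis in Step 2 is picked greedily from the Step 1 values instead of by full row reduction, and the dual basis is found by exhaustive search instead of being read from Table VI. All function names are hypothetical.

```python
import random

# GF(2^8) with beta = 0x02 a root of x^8+x^4+x^3+x^2+1 (assumed byte layout).
EXP, LOG = [0] * 510, [0] * 256
_x = 1
for _i in range(255):
    EXP[_i], LOG[_x] = _x, _i
    _x = (_x << 1) ^ (0x11D if _x & 0x80 else 0)
EXP[255:] = EXP[:255]

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def inv(a):
    return EXP[255 - LOG[a]]

def trace(x):
    """tr(x) = x + x^2 + ... + x^128, returned as the integer 0 or 1."""
    t = 0
    for _ in range(8):
        t, x = t ^ x, mul(x, x)
    return t

A = [EXP[17 * k] for k in range(14)]     # alpha_1..alpha_14, 0-based here
W = [1, EXP[17], EXP[68]]
ETA = [1, EXP[1]]                        # eta_1, eta_2
XI = [EXP[17 * j] for j in range(4)]     # xi_1..xi_4

def multiplier(m):
    # Assumed normalization: v_m = 1 / prod_{j != m} (alpha_m - alpha_j).
    prod = 1
    for j, a in enumerate(A):
        if j != m:
            prod = mul(prod, A[m] ^ a)
    return inv(prod)

def coeffs(m, star):
    """Step 1: v_m * eta_t * p_{j,*}(alpha_m), ordered t = 1, 2 with j = 1..4 inside."""
    out = []
    for eta in ETA:
        for xi in XI:
            p = xi
            for w in W:
                p = mul(p, A[m] ^ A[star] ^ mul(inv(w), xi))
            out.append(mul(multiplier(m), mul(eta, p)))
    return out

def span_basis(values):
    """Step 2: greedy GF(2)-basis from values; masks[i] selects basis elements XORing to values[i]."""
    basis, masks, reduced = [], [], []
    for v in values:
        r, mask = v, 0
        for rv, rm in reduced:            # kept sorted by leading bit, descending
            if r ^ rv < r:
                r, mask = r ^ rv, mask ^ rm
        if r:                             # v is independent of the basis so far
            reduced.append((r, mask | (1 << len(basis))))
            reduced.sort(reverse=True)
            mask = 1 << len(basis)
            basis.append(v)
        masks.append(mask)
    return basis, masks

def helper_bits(m, star, symbol):
    """Step 3: the bits tr(s_mj * N_m) that helper m transmits."""
    return [trace(mul(s, symbol)) for s in span_basis(coeffs(m, star))[0]]

def dual_basis(gammas):
    """Exhaustive search for gamma'_j with tr(gamma_i * gamma'_j) = 1 iff i == j."""
    return [next(y for y in range(1, 256)
                 if all(trace(mul(g, y)) == (i == j) for i, g in enumerate(gammas)))
            for j in range(len(gammas))]

def repair(star, downloads):
    """Steps 4 and 5: rebuild N_* from downloads[m], the bit list of helper m."""
    bits = [0] * 8
    for m, d in downloads.items():
        masks = span_basis(coeffs(m, star))[1]
        for i, mask in enumerate(masks):
            for k, bit in enumerate(d):
                if (mask >> k) & 1:
                    bits[i] ^= bit        # the trace is linear over GF(2)
    result = 0
    for bit, g_dual in zip(bits, dual_basis(coeffs(star, star))):
        if bit:
            result ^= g_dual
    return result

def demo(seed=0):
    rng = random.Random(seed)
    u = [rng.randrange(256) for _ in range(10)]
    nodes = []
    for a in A:
        acc = 0
        for c in reversed(u):             # Horner evaluation of f at alpha
            acc = mul(acc, a) ^ c
        nodes.append(acc)
    for star in range(14):
        downloads = {m: helper_bits(m, star, nodes[m]) for m in range(14) if m != star}
        assert all(len(d) == 4 for d in downloads.values())
        assert repair(star, downloads) == nodes[star]
    return True
```

Calling `demo()` fails each of the 14 nodes in turn and repairs it from 4 bits per helper, that is 13 x 4 = 52 bits in total, compared with the 80 bits read by conventional interpolation from ten full symbols.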
Example
[0268] According to one embodiment, a detailed repair algorithm
example is provided in which the failed node is N.sub.1.
[0269] Step 1: The 8 polynomial evaluations
.nu..sub.m.eta..sub.tp.sub.j,1(.alpha..sub.m), t.di-elect cons.[2],
j.di-elect cons.[4], are provided in Table VII.
TABLE-US-00007 TABLE VII THE COEFFICIENTS OF DIFFERENT HELPERS WHEN N.sub.1 FAILS.
Row (t, j) | N.sub.2 | N.sub.3 | N.sub.4 | N.sub.5 | N.sub.6 | N.sub.7 | N.sub.8 | N.sub.9 | N.sub.10 | N.sub.11 | N.sub.12 | N.sub.13 | N.sub.14
1 (t = 1, j = 1) | .beta..sup.17 | 1 | 0 | 1 | .beta. | .beta. | .beta..sup.102 | .beta. | .beta..sup.17 | .beta..sup.17 | .beta. | 0 | .beta..sup.119
2 (t = 1, j = 2) | .beta..sup.17 | .beta..sup.119 | .beta..sup.68 | 0 | .beta..sup.68 | .beta. | .beta..sup.203 | .beta..sup.85 | .beta. | .beta..sup.51 | 0 | .beta..sup.102 | .beta.
3 (t = 1, j = 3) | .beta..sup.17 | 1 | .beta. | 0 | .beta..sup.170 | 0 | .beta..sup.68 | 0 | .beta..sup.17 | .beta..sup.51 | .beta..sup.238 | .beta..sup.136 | .beta..sup.17
4 (t = 1, j = 4) | .beta..sup.119 | .beta..sup.110 | 0 | .beta..sup.119 | .beta. | .beta..sup.187 | .beta..sup.203 | 0 | .beta. | .beta..sup.51 | .beta..sup.17 | .beta..sup.238 | .beta..sup.17
5 (t = 2, j = 1), surviving entries in order: .beta..sup.18, 0, .beta., .beta..sup.188, .beta..sup.103, .beta..sup.239, .beta..sup.18, .beta., .beta., 0, .beta.
6 (t = 2, j = 2) | .beta..sup.18 | .beta..sup.120 | .beta..sup.69 | 0 | .beta..sup.69 | .beta..sup.69 | .beta..sup.205 | .beta..sup.85 | .beta. | .beta..sup.52 | 0 | .beta..sup.103 | .beta.
7 (t = 2, j = 3), surviving entries in order: .beta..sup.18, .beta..sup.205, 0, .beta..sup.171, 0, .beta., 0, .beta..sup.18, .beta..sup.52, .beta., .beta..sup.137, .beta..sup.18
8 (t = 2, j = 4) | .beta..sup.120 | .beta..sup.120 | 0 | .beta. | .beta. | .beta. | .beta. | 0 | .beta..sup.62 | .beta..sup.52 | .beta. | .beta. | .beta..sup.18
(Some entries were missing or illegible when filed; a bare .beta. may be an entry whose exponent was lost. Rows 5 and 7 retain only 11 and 12 of their 13 entries, so their column alignment is not recoverable.)
TABLE-US-00008 TABLE VIII THE COEFFICIENTS OF DIFFERENT HELPERS WHEN N.sub.1 FAILS.
Helper node | N.sub.2 | N.sub.3 | N.sub.4 | N.sub.5 | N.sub.6 | N.sub.7 | N.sub.8 | N.sub.9 | N.sub.10 | N.sub.11 | N.sub.12 | N.sub.13 | N.sub.14
s.sub.m1 | .beta..sup.17 | 1 | .beta. | 1 | .beta. | .beta. | .beta. | .beta..sup.85 | .beta..sup.17 | .beta..sup.17 | .beta. | .beta..sup.102 | .beta..sup.119
s.sub.m2 | .beta..sup.18 | .beta. | .beta..sup.69 | .beta. | .beta. | .beta..sup.69 | .beta..sup.204 | .beta..sup.86 | .beta. | .beta..sup.18 | .beta..sup.238 | .beta. | .beta..sup.238
s.sub.m3 | .beta..sup.119 | .beta..sup.119 | .beta..sup.204 | .beta..sup.35 | .beta..sup.195 | .beta..sup.187 | .beta..sup.103 | .beta..sup.238 | .beta..sup.18 | .beta..sup.51 | .beta..sup.120 | .beta..sup.103 | .beta..sup.120
s.sub.m4 | .beta..sup.120 | .beta..sup.120 | .beta..sup.205 | .beta..sup.69 | .beta..sup.116 | .beta..sup.188 | .beta..sup.203 | .beta..sup.239 | .beta. | .beta..sup.52 | .beta..sup.239 | .beta..sup.137 | .beta..sup.239
(Some entries were missing or illegible when filed; a bare .beta. may be an entry whose exponent was lost.)
TABLE-US-00009 TABLE IX [?] OF DIFFERENT HELPERS WHEN N.sub.1 FAILS.
Row (t, j) | N.sub.2 | N.sub.3 | N.sub.4 | N.sub.5 | N.sub.6 | N.sub.7 | N.sub.8 | N.sub.9 | N.sub.10 | N.sub.11 | N.sub.12 | N.sub.13 | N.sub.14
1 (t = 1, j = 1) | D.sub.2,1 | D.sub.3,1 | 0 | D.sub.5,1 | D.sub.6,1 | D.sub.7,[?] | D.sub.8,1 | D.sub.9,3 | D.sub.10,1 | D.sub.11,1 | D.sub.12,1 | 0 | D.sub.14,1
2 (t = 1, j = 2) | D.sub.2,1 | D.sub.3,3 | D.sub.4,1 | 0 | D.sub.6,2 | D.sub.7,1 | D.sub.8,2 | D.sub.9,1 | D.sub.10,2 | D.sub.11,3 | 0 | D.sub.13,1 | D.sub.14,2
3 (t = 1, j = 3) | D.sub.2,1 | D.sub.3,1 | D.sub.4,3 | 0 | D.sub.6,1 + D.sub.6,2 | 0 | D.sub.8,1 + D.sub.8,2 | 0 | D.sub.10,3 | D.sub.11,3 | D.sub.12,2 | D.sub.13,2 | D.sub.14,1 + D.sub.14,2
4 (t = 1, j = 4) | D.sub.2,[?] | D.sub.3,3 | 0 | D.sub.5,3 | D.sub.6,1 | D.sub.7,3 | D.sub.8,2 | 0 | D.sub.10,1 + D.sub.10,3 | D.sub.11,3 | D.sub.12,1 + D.sub.12,2 | D.sub.13,1 + D.sub.13,2 | D.sub.14,1 + D.sub.14,2
5 (t = 2, j = 1) | D.sub.2,3 | D.sub.3,2 | 0 | D.sub.5,2 | D.sub.6,[?] | D.sub.7,[?] | D.sub.8,[?] | D.sub.9,4 | D.sub.10,3 | D.sub.11,2 | D.sub.12,[?] | 0 | D.sub.14,2
6 (t = 2, j = 2) | D.sub.2,2 | D.sub.3,4 | D.sub.4,2 | 0 | D.sub.6,3 | D.sub.7,2 | D.sub.8,3 | D.sub.9,2 | D.sub.10,4 | D.sub.11,4 | 0 | D.sub.13,3 | D.sub.14,1 + [?]
7 (t = 2, j = 3) | D.sub.2,3 | D.sub.3,2 | D.sub.4,4 | 0 | D.sub.6,3 + D.sub.6,4 | 0 | D.sub.8,3 + D.sub.8,4 | 0 | D.sub.10,3 | D.sub.11,4 | D.sub.12,4 | D.sub.13,4 | D.sub.14,2
8 (t = 2, j = 4) | D.sub.2,4 | D.sub.3,4 | 0 | D.sub.5,4 | D.sub.6,3 | D.sub.7,4 | D.sub.8,4 | 0 | D.sub.10,2 + D.sub.10,4 | D.sub.11,1 | D.sub.12,3 + D.sub.12,[?] | D.sub.13,3 + D.sub.13,4 | D.sub.14,3 + D.sub.14,2
([?] indicates data missing or illegible when filed. Row 3 also carries one further entry, 0, among its continuation entries whose column cannot be determined.)
[0270] Step 2: {s.sub.m1, s.sub.m2, s.sub.m3, s.sub.m4} are tabulated in Table
VIII.
[0271] Step 3: Download D.sub.m,j=tr.sub.F/B(s.sub.mjN.sub.m), j.di-elect cons.[4].
[0272] Step 4:
{tr.sub.F/B(.nu..sub.m.eta..sub.tp.sub.j,1(.alpha..sub.m)N.sub.m),
t.di-elect cons.[2], j.di-elect cons.[4]} are calculated in Table
IX, and {tr.sub.F/B(.nu..sub.1.eta..sub.tp.sub.j,1(1)N.sub.1), t.di-elect
cons.[2], j.di-elect cons.[4]} are calculated from (3).
[0273] Step 5: We can calculate N.sub.1 from (4) and Table VI.
[0274] As such, a detailed algorithm is provided to implement the
repair of the RS(14, 10) code over GF(2.sup.8) in Facebook's f4
system. Our algorithm can repair one failed node and requires only
4 bits from each helper.
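As a hedged cross-check of the example, the following Python sketch recomputes the first row of Table VII (t = 1, j = 1) for the case in which N.sub.1 fails. It assumes the byte representation used for illustration (.beta. as 0x02, modulus 0x11D) and the normalization 1/prod(.alpha..sub.m-.alpha..sub.j) for the column multipliers; the function names are hypothetical.

```python
# GF(2^8) setup (assumed representation: beta = 0x02, modulus 0x11D).
EXP, LOG = [0] * 510, [0] * 256
_x = 1
for _i in range(255):
    EXP[_i], LOG[_x] = _x, _i
    _x = (_x << 1) ^ (0x11D if _x & 0x80 else 0)
EXP[255:] = EXP[:255]


def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def inv(a):
    return EXP[255 - LOG[a]]


A = [EXP[17 * k] for k in range(14)]      # alpha_1..alpha_14, 0-based here
W = [1, EXP[17], EXP[68]]


def multiplier(m):
    # Assumed normalization: v_m = 1 / prod_{j != m} (alpha_m - alpha_j).
    prod = 1
    for j, a in enumerate(A):
        if j != m:
            prod = mul(prod, A[m] ^ a)
    return inv(prod)


def step1_value(m, star, t, j):
    """v_m * eta_t * p_{j,*}(alpha_m); t in {1, 2}, j in {1..4}, m and star 0-based."""
    eta = [1, EXP[1]][t - 1]
    xi = EXP[17 * (j - 1)]
    p = xi
    for w in W:
        p = mul(p, A[m] ^ A[star] ^ mul(inv(w), xi))
    return mul(multiplier(m), mul(eta, p))


def as_power(x):
    """Render an element the way the tables do: 0, 1, or a power of beta."""
    if x in (0, 1):
        return str(x)
    return f"beta^{LOG[x]}"


def first_row_when_n1_fails():
    """Values for helpers N_2..N_14 with t = 1, j = 1 and failed node N_1."""
    return [as_power(step1_value(m, 0, 1, 1)) for m in range(1, 14)]
```

Under these assumptions the entries computed for N.sub.2, N.sub.3, N.sub.4, N.sub.5, N.sub.8, N.sub.10, N.sub.11, N.sub.13 and N.sub.14 match the values that are legible in the first row of Table VII, including the two zeros at N.sub.4 and N.sub.13.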
[0275] While this disclosure has been particularly shown and
described with references to exemplary embodiments thereof, it will
be understood by those skilled in the art that various changes in
form and details may be made therein without departing from the
scope of the claimed embodiments.
* * * * *