MIT LCS TR 731.ps 731
MIT-LCS-TR-731 MIT-LCS-TR-731
User Manual: MIT-LCS-TR-731
Open the PDF directly: View PDF
.
Page Count: 16
| Download | |
| Open PDF In Browser | View PDF |
Invited paper: 17th Conference on Foundations of Software Technology and Theoretical
Computer Science, Kharagpur, India, 1997.
Algorithmic issues in coding theory
Madhu Sudan
October 9, 1997
Abstract
The goal of this article is to provide a gentle introduction to the basic definitions, goals and constructions in coding theory. In particular we focus on
the algorithmic tasks tackled by the theory. We describe some of the classical
algebraic constructions of error-correcting codes including the Hamming code,
the Hadamard code and the Reed Solomon code. We describe simple proofs
of their error-correction properties. We also describe simple and ecient algorithms for decoding these codes. It is our aim that a computer scientist with
just a basic knowledge of linear algebra and modern algebra should be able to
understand every proof given here. We also describe some recent developments
and some salient open problems.
1 Introduction
Error-correcting codes are combinatorial structures that allow for the transmission of information over a noisy channel and the recovery of the information
without any loss at the receiving end. Error-correcting codes come in two basic
formats. (1) The \block error-correcting code": Here the information is broken
up into small pieces. Each piece contains a xed nite amount of information.
The encoding method is applied to each piece individually (independently). The
resulting encoded pieces (or blocks) are sent over the noisy channel. (2) The
\convolutional codes": Here the information is viewed as a potentially in nite
stream of bits and the encoding method is structured so as to handle an in nite
stream. This survey will be restricted to the coverage of some standard block
error-correcting codes.
Formally a block error-correcting code may be speci ed by an encoding function C. The input to C is a message m, which is a k-letter string over some alphabet (typically = f0; 1g but we will cover more general codes as well). E
1
maps m into a longer n-letter string over the same alphabet1. The mapped string
is referred to as a codeword. The basic idea is that in order to send the message
m over to the receiver, we transmit instead the codeword C(m). By the time this
message reaches the destination it will be corrupted, i.e., a few letters in C(m)
would have changed. Say the received word is R. Hopefully R will still be able to
convey the original message m even if it is not identically equal to C(m). The only
way to preserve this form of redundancy is by ensuring that no two codewords are
too \close" to each other. This brings us to the important notion of \close"ness
used, namely the Hammingdistance. The Hamming distance between two strings
x; y 2 n , denoted (x; y), is the number of letters where x and y di er. Notice that forms a metric, i.e., (x; y) = 0 ) x = y, (x; y) = (y; x) and
(x; y) + (x; z) (x; z). A basic parameter associated with a code is its
distance i.e., the maximum value d such that any two codewords are a Hamming
distance of at least d apart. Given a code of distance d and a received word R that
di ers from C(m) in at most e d , 1 places, the error in the transmission can be
detected. Speci cally, we can tell that some letter(s) has been corrupted in the
transmission, even though we may not know which letters are corrupted. In order
to to actually correct errors we have to be able to recover m uniquely based on R
and a bound t on the number of errors that may have occurred. To get the latter
property t has to be somewhat smaller than d , 1. Speci cally if t b(d , 1)=2c,
then we notice that indeed there can be at most one message m such that
(C(m); R) t. (If m1 and m2 both satisfy (C(m1 ); R); (C(m2); R) t,
then (C(m1); C(m2 )) (m1 ; R) + (R; m2 ) 2t d , 1, contradicting the
distance of C.) Thus in an information theoretic sense R maintains the information contained in m. Recovering the information m eciently from C is another
matter and we will come back to this topic presently.
To summarize the discussion above we adopt the following terse notation that
is standard in coding theory. A code C is an [n; k; d]q code if C : k ! n , where
j j = q with minx;y2 k f(C(x); C(y))g = d. With some abuse of notation we
will use C to denote the image of the map C (i.e., C may denote the collection of
codewords rather than the map). C is called a e-error-detecting code for e = d , 1
and a t-error correcting code for t = b(d , 1)=2c.
In the remaining sections of this article we will describe some common constructions of [n; k; d]q for various choices of the parameters n; k; d and q. We
will also describe the algorithmic issues motivated by these combinatorial objects and try to provide some solutions (and summarize the open problems).
(We assume some familiarity with algebra of nite elds [10, 19].) Before going
on to these issues, we once again stress the importance of the theory of errorcorrecting codes and its relevance to computer science. The obvious applications
of error-correcting codes are to areas where dealing with error becomes important
such as storage of information on disks, CDs, and communication over modems
1
The assumption that the message is a k-letter string over is just made for notational convenience. As it will become obvious, the representation of the message
space is irrelevant to the communication channel. The representation of the encoded
string is however very relevant!
etc. Additionally, and this is where they become important to the theoretical
computer scientist, error-correcting codes come into play in several ways in complexity theory | for example, in fault-tolerant computing, in cryptography, in
the derandomization of randomized algorithms and in the construction of probabilistically checkable proofs. In several of these cases it is not so much the nal
results as the notions, methods and ingredients from coding theory that help. All
of this makes it important that a theoretical computer scientist be comfortable
with the methods of this eld | and this is the goal of this article. A reader
interested in further details may try one of the more classical texts [2, 11, 17].
Also, the article of Vardy [18] is highly recommended for a more detailed account
of progress in coding theory. The article is also rich with pointers to topics of
current interest.
2 Linear Codes
While all questions relating to coding theory can be stated in general, we will
focus in our article on a subset of codes called linear codes. These codes are obtained by restricting the underlying alphabet to be a nite eld of cardinality
q with binary operations \+" and \". Thus a string in n can be thought of as
a vector in n-dimensional space, with induced operations \+" (vector addition),
and \" (scalar multiplication). Thus a code C n is now a subset of the
vectors. If this subset of vectors forms a \subspace" then the code is linear, as
made formal below:
De nition1. C n is a linear code if 8a 2 ; x; y 2 C , x + y; a x 2 C .
Many of the parameters of error-correcting codes become very clean in the
case of linear codes. For instance, how does one specify a code C 2 n ? For
general codes, succinct representations may not exist! However, for every linear
code a succinct representation, of size polynomial in n does exist! In particular,
we have the following two representations:
1. For every [n; k; d]q linear code C there exists an n k \generator" matrix
G = GC with entries from such that C = fGxjx 2 k g.
2. For every [n; k; d]q code C there exists an (n , k) n parity check matrix
H = HC over such that C = fy 2 n s.t. Hy = 0g.
Conversely, the following hold: Every n k matrix G over de nes an
[n; k0; d]q code, for some d 1 and k0 k, CG having as codewords fGxjx 2 k g.
Similarly every (n , k) n matrix H de nes an [n; k0; d] code CH0 , for some d 1
and k0 k, having as codewords fy 2 n jHy = 0g.
Exercise:
1. Prove properties (1) and (2) above.
2. Given the generator matrix GC of a code C , give a polynomial time algorithm
to compute a parity check matrix HC for C .
3. Show that if G is of full column rank (H is of full row rank) then the code
CG (CH ) is an [n; k; d]q code.
3 Some common constructions of codes
In this section we describe some common construction of codes. But rst let us
establish the goal for this section. In general we would like to nd families of
[n; k; d]q codes for in nitely many triples (n; k; d) for some xed q. The property
we would really like is that k=n and d=n are bounded away from zero as n !
1. Such a code is termed asymptotically good and the two properties k=n >
0 and d=n > 0 are termed constant message-rate and constant distance-rate
respectively. Unfortunately we will not be able to get to this goal in this article.
But we will settle for what we term weakly good codes. These are codes with
polynomial message-rate, i.e., k = (n ) for some > 0 and constant distancerate.
3.1 Hamming code
Hamming codes are de ned for every positive n such that there exists an integer
l such that n = 2l , 1. Then the Hamming code of block size n over the alphabet
f0; 1g is given by an l n parity check matrix H Hmg whose columns are all the
distinct l-dimensional non-zero vectors. Notice that there are exactly 2l , 1 of
these.
Lemma 2. For every positive integer n such that n = 2l , 1 for some integer l,
the Hamming code of block size n is an [n; n , l; 3]2 code.
Proof Sketch. Notice that the rank of H Hmg is l. In particular the column vectors
containing exactly one 1 are linearly independent and there are l of them. Thus
we nd that the Hamming code is an [n; k; d]2 code for k = n , l.
We now move to showing that the distance of the Hamming code is 3. Notice
that the code has no elements of weights since this would imply that two vectors
in the parity check matrix are identical. This implies the distance is at least
3. Now consider any two column vectors v1 and v2 in H Hmg. Notice that the
vector v1 + v2 is also a column vector of H Hmg and is distinct from v1 and
v2 . Now consider the n dimensional vector which is zero everywhere except in
the coordinates corresponding to the vectors v1; v2 and v1 + v2. This vector has
weight 3 and is easily seen to be an element of the Hamming code. Thus the
distance of the Hamming code is exactly 3.
The Hamming code is a simple code with a very good rate. Unfortunately
it can only correct 1 error, de nitely far from our goal of constant error-rate.
Next we move on to a code with good error-correcting properties, but with very
low-rate.
3.2 Hadamard code
A Hadamard matrix is an n n matrix M with entries from 1 such that
MM T = n In where In is the n n identity matrix. A Hadamard matrix
immediately leads to an error correcting code where the rows of M are the
codewords. This leads to a codeword over the alphabet = f+1; ,1g. We prove
the distance property of the code rst.
Lemma 3. If M is a Hadamard matrix then any two rows agree is exactly n=2
places.
Proof. Say the rows of interest are the ith and jth rows. Then consider the
element (MM T )ij . This element is the sum of n terms, with the kth term being
mik mjk . Notice that this term evaluates to +1 if mik = mjk and ,1 otherwise.
Thus if the ith and jth rows disagree in t places, then (MM T )ij = (n , t) + t.
Since (MM T )ij = 0, we have that n , 2t = 0 and hence the two rows (dis)agree
in exactly n=2 places.
Thus the task of constructing a Hadamard code reduces to the task of constructing Hadamard matrices. Constructions of Hadamard matrices have been
a subject of much interest in combinatorics. It is clear (from Lemma 3) that for
an n n Hadamard matrix to exists n must be even. The converse is not known
to be true and is still an open question. What is known is that an n n matrix
exists for every n of the form p , 1 where p is a prime. It is also known that if an
n1 n1 Hadamard matrix exists and an n2 n2 Hadamard matrix exists, then
an n1n2 n1 n2 matrix exists. Many other such constructions are also known but
not all possibilities are covered yet. Here we give the basic construction which
applies when n is a power of 2. These constructions are described recursively as
follows:
2
M1Hdm = 4
+1 +1
+1 ,1
3
2
5
MlHdm = 4
Hdm
+MlHdm
,1 +Ml,1
3
5:
+Ml,1 ,Ml,1
Lemma 4. For every l, the rows of MlHdm form a [2l ; l; 2l,1]2 code.
Hdm
Hdm
Proof. Left as an exercise to the reader.
The Hadamard codes maintaina constant distance-rate. However their messagerate approaches zero very quickly. Next we describe a code with constant messagerate and distance-rate. The catch is that the code uses an alphabet of growing
size.
3.3 Reed Solomon code
The Reed Solomon codes are a family of codes de ned over an alphabet of
growing size, with n q. The more common de nition of this code is not (we feel)
as intuitive or as useful as the \folklore" de nition. We present both de nitions
here, starting with the more useful one and then show the equivalence of the
two.
De nition5 (Reed Solomon codes). Let be a eld of size q, n q and let
x0 ; : : :; xn,1 be some xed enumeration of n of the elements of . (It is standard
to pick n = q , 1 and xi = i for some primitive element 2. Then for every
1
k n, the Reed Solomon code CRS
m=
;n;k;q is de ned as follows: A message
Pk,1
m0 : : :mk,1 corresponds to the degree k , 1 polynomial M(x) = i=0 mi xi .
1
The encoding of m, is CRS
;n;k;q (m) = c0 : : :cn,1 where cj = M(xj ).
The distance properties of the Reed Solomon codes follow immediately from
the fact that a degree k , 1 polynomial may only have k , 1 zeroes unless all of
its coecients are zero.
1
Lemma 6. For every n q and k n, the Reed Solomon code CRS
;n;k forms
an [n; k; n , k]q linear code.
Proof. The fact that the code is linear follows from the fact that if M0 (x) and
M1 (x) are polynomials of degree at most k , 1 then so is M0 (x) + M1 (x). The
distance follows from the fact that if M0 (xj ) = M1 (xj ) for k values of j then
M0 M1 (or equivalently if M0 (xj ) , M1 (xj ) is zero for k values of j, then
M0 , M1 is the zero polynomial).
Finally for the sake of completeness we present a second de nition of Reed
Solomon codes. This de nition is more commonly seen in the texts, but we feel
this part may be safely skipped at rst reading.
De nition7 (Reed Solomon codes). Let be a eld of size q with primitive
element , and let n = q , 1, k n. Let Pk;q (x) be the polynomial (x , )(x ,
2 ) (x , n,k). The Reed Solomon code C 2
RS;n;k;q is de ned as follows: A
message
m
=
m
:
:
:m
corresponds
to
the
degree
k , 1 polynomial M(x) =
0
k
,
1
Pk,1
i . The encoding of m, is C 1
m
x
(m)
=
c
i
0 : : :cn,1 where cj is the
i=0
RS;n;k;q
coecient of xj in the polynomial Pk;q (x)M(x).
Viewed this way it is hard to see the correspondence between the two de nitions (or the distance property). We prove an equivalence next.
Lemma 8. The de nitions of Reed Solomon codes given in De nitions 5 and 7
coincide for n = q , 1 and the standard enumeration of the elements of GF(q).
Proof. Notice that it suces to prove that every codeword according to the rst
de nition is a codeword according to the second de nition. The fact that the
sets are of the same size implies that they are identical.
1
Consider the encoding of m = m0 : : :mk,1 . This encoding is CRS
;n;k;q =
Pk,1
i
j
c0 : : :cn,1 with ci = j =0 mj ( ) . To show that this is a codeword P
according
,1 ci xi
to the second de nition we need to verify that the polynomial C(x) = ni=0
2
is a primitive element of the eld GF(q) if
j
6= 1 for any j < q , 1.
has (x , l ) as a factor for every l 2 f1; : : :; n , kg. Equivalently it suces to
verify that C( l ) = 0, which we do next:
n,1
X
C( l ) = ci ( l )i
=
i=0
nX
,1 kX
,1
mj ( i )j ( l )i
i=0 j =0
kX
,1 nX
,1
= mj ( j l )i
j =0
i=0
q
,2
kX
,1 X
i
= mj
j;l
j =0 i=0
where j;l = j +l . Notice that for every j; l s.t. j + P
l 6= q , 1, j;l 6= 1.
,2 i = 03. Since
Notice further that for every such j;l the summation qi=0
j;l
j 2 f0; : : :; k , 1g, we nd that j;l 6= 1 for every l 2 f1; : : :; q , 1 , kg. Thus for
every l 2 f1; : : :; n , kg, we nd that C( l ) = 0. This concludes the proof.
3.4 Multivariate polynomial codes
The next family of codes we describe are not very commonly used in coding
theory, but have turned out to be fairly useful in complexity theory and in
particular in the results on probabilistically checkable proofs. Surprisingly these
codes turn out to be a common generalization of Hadamard codes and Reed
Solomon codes!
De nition9 (Multivariate polynomial code). For integer parameters m; l
and q with l < q, the multivariate polynomial code Cpoly;m;l;q has
P as message a string of coecients m = fmi1 ;i2 ;:::;im g with ij 0 and j ij l.
This
sequence is interpreted as the m-variate polynomial M(x1 ; : : :; xm) =
P
i1
i
i1 ;:::;ij mi1 ;:::;ij x1 xmm . The encoding of m is the string of letters fM(x1; : : :; xm )g
with one letter for every (x1 ; : : :; xm) 2 m .
Obviously the multivariate polynomial codes form a generalization of the
Reed Solomon codes (again using the rst de nition given here of Reed Solomon
codes). The distance property of the multivariate polynomial codes follow also
from the distance property of multivariate polynomials (cf. [5, 13, 21]).
Lemma 10. For integers
m;l and q with l < q, the code Cpoly;m;l;q is an [n; k; d]q
,
code with n = qm , k = mm+l and d = (q , l)qm,1 .
3
This identity is obtained as follows: Recall that Fermat's little theorem asserts that
q ,1
, 1 = 0 for every non-zero
P ,2 i in GF(q). Factoring the left hand side, we nd
that either , 1 = 0 or qi=0
= 0. Since 6= 1, the latter must be the case.
Proof. The bound
The fact that the number of coecients
,
P on n is immediate.
i1 ; : : :; im s.t. j ij l is at ml+l is a well-known exercise in counting. Finally
the bound on the distance follows from the fact a degree l polynomial can only
be zero for l=q fraction of its inputs. (This is an easy inductive argument based
on the number of variables. The base case is well known and inductively one
picks a random assignment to the variables x1; : : :; xm,1 and argues that the
resulting polynomial in xm is non-zero with high probability. Finally one uses
the base case again to conclude that the nal polynomial in xm is left non-zero
by a random assignment to xm .)
1
It is easy to see that the code CRS
;q;k;q is the same as the code Cpoly;1;k,1;q.
Also notice that the code Cpoly;m;1;2 forms an [2m ; m; 2m,1]2 code, same as
parameters of the Hadamard code given by the rows of MmHdm. It turns out that
these two codes are in fact identical. The proof is left as an exercise to the reader.
3.5 Concatenated codes
Each code in the collection of codes we have accumulated above has some aw
or the other. The Hamming codes don't correct too many errors, the Hadamard
codes are too low-rate, and the Reed Solomon codes depend on a very large
alphabet. Yet it turns out it is possible to put some of these codes together
and obtain a code with reasonably good behavior (\polynomially good"). This
is made possible by a simple idea called \concatenation", de ned next.
De nition11 (Concatenation of codes). Let C1 be an [n1; k1; d1]q1 code over
the alphabet 1 and let C2 be an [n2; k2; d2]q2 code over the alphabet 2 . If
q1 = q2k2 then the code C1 C2 is de ned as follows: Associate every letter in
1 with a codeword of C2 . Encode every message rst using the code C1 and
then encode every letter in the encoded string using the code C2. More formally, given a message m 2 1k1 = 2k1 k2 , let C1(m) = c1 : : :cn1 2 1n1 . The
encoding C1 C2 (m) is given by c11 : : :c1n2 c21 : : :cn1 n2 2 2n1 n2 , where for every
i 2 f1; : : :; n1g, ci1 : : :cin2 = C2 (ci ).
Almost immediately we get the following property of concatenation.
Lemma 12. If C1 is an [n1; k1; d1]q1 code and if C2 is an [n2; k2; d2]q2 code with
q1 = q2k2 , then C1 C2 is an [n1n2; k1k2; d0]q2 code, for some d0 d1d2.
Proof. The block size and message size bounds follow from the de nition. To
see the distance property, consider two messages m1 ; m2 2 1k1 . For l 2 f1; 2g,
let cl1 : : :cln1 be the encoding of ml using C1 and let cl11 : : :cln1n2 be its encoding
using C1 C2. Notice that there must exist at least d1 values of i such that c1i 6= c2i
(by the distance of C1). For every such i, there must exist at least d2 values of j
such that c1ij 6= c2ij (by the distance of C2 ). Thus we nd that C1 C2 (m1 ) and
C1 C2 (m2 ) di er in at least d1 d2 places.
To best see the power of concatenation, consider the following simple application: Let C1 be a Reed Solomon code with q = 2m , n = q and k = :4n.
I.e., C1 is an [n; :4n; :6n]2m code with n = 2m . Let C2 be the Hadamard code
[2m; m; 2m,1 ]2. The concatenation C1 C2 is an [n2; :4n logn; :3n2]2 code. I.e., the
resulting code has constant distance-rate, polynomial rate and is over the binary
alphabet! Thus this satis es our weaker goal of obtaining a weakly-good code.
Even the goal of obtaining an asymptotically good code is close now. In particular, the code of Justesen is obtained by an idea similar to that of concatenation.
Unfortunately we shall not be able to cover this material in this article.
4 Algorithmic tasks
We now move on to the algorithmic tasks of interests: The obvious rst candidate
is encoding.
Problem 13 (Encoding).
n k matrix G and message m 2 k .
Output: C (m), where C = CG is the code with G as the generator matrix.
Input:
It is clear that the problem as speci ed above is easily solved in time O(nk)
and hence in time polynomial in n. For speci c linear codes such as the Reed
Solomon codes it is possible to encode the codes faster, in time O(n logc n) for
some constant c. However till recently no asymptotically good code was known
to be encodable in linear time. In a recent breakthrough. Spielman [15] presented
the rst known code that is encodable in linear time. We will discuss this more
in a little bit.
The next obvious candidate problem is the decoding problem. Once again
it is clear that if the received word has no errors, then this problem is only as
hard as solving a linear system and thus can be easily solved in polynomial time.
So our attention moves to the case where the received word has errors. We rst
de ne the error detection problem.
Problem 14 (Error detection).
n k generator matrix G for a code C = CG; and a received word R 2 n .
Output: Is R a codeword?
Input:
The error detection problem is also easy to solve in polynomial time. We nd
the parity check matrix H for the code C and then check if HR =0. We now
move to the problem of decoding in the presence of errors. This problem comes
in several variants. We start with the simple de nition rst:
Problem 15 (Maximum likelihood decoding).
n k generator matrix G for a code C = CG; and a received word R 2 n .
Find a codeword x 2 C , that is nearest to R in Hamming distance.
(Ties may be broken arbitrarily.)
Input:
Output:
There are two obvious strategies for solving the maximumlikelihood decoding
problem:
Brute Force 1: Enumerate all the codewords and nd the one that is closest
to R.
Brute Force 2: For t = 0; 1; : : :;, do: Enumerate all possible words within a
Hamming distance of t from R and check if the word is a codeword. Output the
rst match.
Despite the naivete of the search strategies above, there are some simple cases
where these strategies work in polynomial time. For instance, the rst strategy
above does work in polynomial time for Hadamard codes. The second strategy
above works in polynomial time for Hamming codes (why?). However, both
strategies start taking exponential time once the number of codewords becomes
large, while distance also remains large. In particular, for \asymptotically good"
or even \weakly good" codes, both strategies above run in exponential time.
One may wonder if this exponential time behavior is inherent to the decoding
problem. In perhaps the rst \complexity" result in coding theory, Berlekamp,
McEliece and van Tilborg [4] present the answer to this question.
Theorem 16 [4]. The Maximum likelihood decoding problem for general linear
codes is NP-hard.
There are two potential ways to attempt to circumvent this result. One
method is to de ne and solve the maximum likelihood decoding problem for
speci c linear codes. We will come to this question momentarily. The other hope
is that we attempt to correct only a limited number of errors. In order to do so,
we further parameterize the maximum likelihood decoding problem as follows:
Problem 17 (Bounded distance decoding).
n k generator matrix G for a code C = CG ; a received word R 2 n
and a positive integer t.
Output: Find any/all codewords in C within a Hamming distance of t from R.
Input:
The hardness result of [4] actually applies to the Bounded distance decoding
problem as well. However one could hope for a result of the form: \There exists an
> 0, such that for every [n; k; d]q linear code C , the bounded distance decoding
problem for C with t = d is solvable in polynomial time". One bottleneck to
such a general result is that we don't know how to compute d for a generic linear
code. This motivates the following problem:
Problem 18 (Minimum distance).
Input:
n k generator matrix G for a code C = CG and an integer parameter d.
Is the distance of C at least d?
Output:
This problem was conjectured to be coNP-hard in [4]. The problem remained
open for nearly two decades. Recently, in a major breakthrough, this problem
was shown to be coNP-complete by Vardy [18]. While this does not directly rule
out the possibility that a good bounded distance decoding algorithm may exist,
the result should be ruled as one more reason that general positive results may
be unlikely.
Thus we move from general results, i.e., where the code is speci ed as part
of the input, to speci c results, i.e., for well-known families of codes. The rst
question that may be asked is: \Is there a family of asymptotically-good [n; k; d]q
linear code and > 0, for which a polynomial time bounded distance decoding
algorithm exists for t d?" For this question the answer is \yes". A large number of algebraic codes do have such polynomial time bounded distance decoding
algorithms. In particular the Reed Solomon codes are known to have such a
decoding algorithm for t b(d , 1)=2c (cf. [2, 11, 17]). This classical result is
very surprising given the non-trivial nature of this task. This result is also very
crucial for many of the known asymptotically good codes, since many of these
codes are constructed by concatenating Reed Solomon codes with some other
codes. In the next section we shall cover the decoding of Reed Solomon codes in
more detail.
Lastly there is another class of codes, constructed by combinatorial means,
for which bounded distance decoding for some t d can be performed in
polynomial time. These are the expander codes, due to Sipser and Spielman [14]
and Spielman [15]. The results culminate in a code with very strong | linear
time (!!!) | encoding and bounded distance decoding algorithms. In addition
to being provably fast, the algorithms for the encoding and decoding of these
codes are surprisingly simple and clean. However, the description of the codes
and analysis of the algorithm is somewhat out of the scope of this paper. We
refer the reader to the original articles [14, 15] for details.
5 Decoding of Reed Solomon code
As mentioned earlier a polynomialtime algorithm for bounded distance decoding
is known and this algorithm corrects up to t b(d , 1)=2c errors. Notice that
this coincides exactly with the error-correction bound of the code (i.e., a Reed
Solomon code of distance d is a t-error-correcting code for t = b(d , 1)=2c). This
bound on the correction capability is inherent, if one wishes to determine the
codeword uniquely. However in the bounded distance decoding problem we do
allow for multiple solutions. Given this latitude it is reasonable to hope for a
polynomial-time decoding algorithm that corrects more errors - say up to t <
(1,)d where is some xed constant. However no such algorithm is known for all
possible values of (n; k; d = n , k). Recently, in [16], we presented an algorithm
which does correct up to (1 , )d errors, provided k=n ! 0. This algorithm
was inspired by an algorithm of Welch and Berlekamp [20, 3] for decoding Reed
Solomon codes. This algorithm is especially clean and elegant. Our solution uses
similar ideas to correct even more errors and we present this next.
Notice rst that the decoding problem for Reed Solomon codes can be solved
by solving the following cleanly stated problem:
Problem 19 (Reed Solomon decoding).
n pairs of points f(xi ; yi)g, xi; yi 2 GF(q); and integers t; k.
All polynomials p of degree at most k , 1 such that yi 6= p(xi) for at
most t values of i.
Input:
Output:
The basic solution idea in Welch-Berlekamp and our algorithm is to nd
an algebraic description of all the given points, and to then use the algebraic
description to extract p. The algebraic description we settle for is an \algebraic
curve in the plane", i.e., a polynomial Q(x; y) in two variables x and y such
that Q(xi ; yi) = 0 for every value of x and y. Given this basic strategy, the
performance of the algorithm depends on the choice of the degree of Q which
allows for such a curve to exist, and still be useful! (For example if we allow Q to
be 0, or if we pick the degree of Q be n in x and 0 in y, the such polynomials do
exist, but are of no use. On the other hand a non-zero polynomial Q of degree
n=10 in x and 0 in y may be useful, but will probably not exist for the given
data points.)
To determine what kind of polynomial Q we should search for, weP
pick two parameters l and m and impose the following conditions on Q(x; y) = i;j qij xiyj :
1. Q should not be the zero polynomial. (I.e., some qij should be non-zero.)
2. qij is non-zero implies j m and i + (k , 1)j l. (The reason for this
restriction will become clear shortly.)
3. Q(xi; yi ) = 0 for every given pair (xi ; yi ).
Now consider the task of searching for such a Q. This amounts to nding
values for the unknown coecients qij . On the other hand the conditions in
(3) above amount to homogeneous linear equations in qij . By elementary linear
algebra a solution to such a system exists and can be found in polynomial time
provided the number of equations (n) strictly exceeds the number of unknowns
(i.e., the number of (i; j) pairs such that 0 i; j, j m and i+(k , 1)j m). It
is easy to count the number of such coecients. The existence of such coecients
will determine our choice of m; l. Having determined such a polynomial we will
apply the following useful lemma to show that p can be extracted from Q.
Lemma 20 [1]. Let Q(x; y) = Pi;j qijxi yj be such that qij = 0 for every i; j
with i + (k , 1)j > l. Then if p(x) is polynomial of degree k , 1 such that for
strictly more than l values of i, yi = p(xi ) and Q(xi; yi ) = 0, then y , p(x)
divides the polynomial Q(x; y).
Proof. Consider rst the polynomial g(x) obtained from Q by substituting y =
p(x). Notice that the term qij xiyj becomes a polynomial in x of degree i+(k , 1)j
which by property (2) above becomes a polynomial of degree at most l in x. Thus
g(x) = Q(x; p(x)) becomes a polynomial in x of degree at most l. Now, for every
i such that yi = p(xi ) and Q(xi; yi ) = 0, we have that g(xi ) = Q(xi ; p(xi)) = 0.
But there are more than l such values of i. Thus g is identically zero. This
immediately implies that Q(x; y) is divisible by y , p(x). (The division theorem
for polynomials says that if a polynomial h(y) evaluates to 0 at y = then
y , divides h(y). Applying this fact to the polynomial Qx (y) = Q(x; y) and
y = p(x), we obtain the desired result. Notice in doing so, we are switching our
perspective. We are thinking of Q as a polynomial in y with coecients from
the ring of polynomials in x.)
Going back to the choice of m and l, we have several possible choices. In one
extreme we can settle for m = 1 and then if l (n + k)=2, then we nd that the
number of coecients is more than n. In this case the polynomial Q(x; y) found
by the algorithm is of the form A(x)y + B(x). Lemma 20 above guarantees that
if t b(n , k)=2c then y , p(x) divides Q. Thus p(x) = ,B(x)=A(x) and can be
computed easily by a simple polynomial division. Thus in this case we can decode
from b(n , k)=2c errors thus recovering the results of [20]. In fact, in this case
the algorithm essentially mimics the [20] algorithm, though the correspondence
may not be immediately obvious.
p
p
At a di erent extreme one p
may pick m n=k and l nk and in this case
Lemma 20 works for t n , 2 nk. In this case to recover p(x) from Q, one rst
factors the bivariate polynomial Q. This gives a list of all polynomial pj (x) such
that y ,pj (x) divides Q. From this list we pull out all the polynomials pj such that
pj (xi ) 6= yi for at most t values of xi . Thus in this case also we have a polynomial
time algorithm provided Q can be factored in polynomial time. Fortunately, such
algorithms are known, due to Kaltofen [8] and Grigoriev [7] (see Kaltofen [9] for
a survey of polynomial factorization algorithms). For k=n ! 0, the number of
errors corrected by this algorithm approaches (1 , o(1))n.
A more detailed analysis of this algorithm and the number of errors corrected
by it appear in [16]. The result shows that this given an [n; n; (1 , )n]q Reed
Solomon code, the number of errors corrected by this algorithm approaches
n 1 , 1 +1 , 2 where =
$r
%
2 1 1
+4,2 :
A plot of this curve against appears in Figure 1. Also shown in the gure
are the distance of the code ((1 , )n) and the classical-error correction bound
((1 , )=2n).
6 Open questions
Given that the fundamental maximum likelihood decoding problem is NP-hard
for a general linear code, the next direction to look to is a bounded distance
decoding algorithm for every [n; k; d]q linear code. The bottleneck to such an
approach is that in general we can't compute d in polynomial time, due to the
recent result of Vardy [18]. Thus the next step in this direction seems to suggest
an application of approximation algorithms:
Open Problem 1 Given an n k matrix G, approximate the distance d of the
code CG to within a factor of (n).
[t]
1
New Correction Bound
Diameter Bound (1 - x)
Classical Correction Bound (1 - x)/2
0.8
0.6
error (e/n)
0.4
0.2
0
0
0.2
0.4
rate (k/n)
0.6
0.8
1
Fig. 1. Fraction of errors corrected by the algorithm from [16] plotted against the rate
of the code. Also plotted are the distance of the code and the classical error-correction
bound.
The goal here is to nd the smallest factor (n) for which a polynomial time
approximation algorithm exists. Currently no non-trivial (i.e., with (n) = o(n))
approximation algorithm is known. A non-trivial (n) approximation algorithm
would then suggest the following candidate for bounded distance decoding:
Open Problem 2 Given an n k matrix G, a word R 2 n and an integer
t, nd all codewords within a Hamming distance of t from R, or show that the
minimum distance of the code is less than t 1(n).
A similar problem is posed by Vardy [18] for 1 = 2. Here the hope would
be to nd the smallest value of 1 for which a polynomial time algorithm exists.
While there is no immediate formal reasoning to believe so it seems reasonable
to believe that 1 will be larger than .
Next we move to the questions in the area of design of ecient codes, motivated by the work of Spielman [15].
Open Problem 3 For every > 0, design a family of [n; n; n]2 codes Cn so
that the bounded distance problem on Cn with parameter t n can be solved in
linear time.
The goal above is to make as large as possible for every xed . Spielman's
result allows for the construction codes which match the best known values of
for any [n; n; n]2 linear code. However the value of is still far from in these
results.
We now move towards questions directed towards decoding Reed-Solomon
codes. We direct the reader's attention to Figure 1. Clearly every point above
the solid curve and below the distance bound of the code, represents an open
problem. In particular we feel that the following version maybe solvable in polynomial time:
Open Problem 4 Find a bounded distance decoding algorithm
for an [n; n; (1,
)n]q Reed Solomon code that decodes up to t (1 , p)n errors.
The motivation for this particular version is that in order to solve the bounded
distance decoding problem, one needs to ensure that the number of outputs (i.e.,
the number codewords within the given bound t) is polynomial in n. Such a
bound does exist for the value of t as given above [6, 12], thus raising the hope
that this problem may be solvable in polynomial time also.
Similar questions may also be raised about decoding multivariate polynomials. In particular, we don't have polynomial time algorithms matching the
bounded distance decoding algorithm from [16], even for the case of bivariate
polynomials. This we feel may be the most tractable problem here.
Open Problem 5 Find a bounded distance decoding algorithm
for the bivariate
p
polynomial code Cpoly;2;n;n that decodes up to t (1 , 2)n2 errors.
References
1. S. Ar, R. Lipton, R. Rubinfeld and M. Sudan. Reconstructing algebraic
functions from mixed data. SIAM Journal on Computing, to appear. Preliminary
version in Proceedings of the 33rd Annual IEEE Symposium on Foundations of
Computer Science, pp. 503{512, 1992.
2. E. R. Berlekamp. Algebraic Coding Theory. McGraw Hill, New York, 1968.
3. E. R. Berlekamp. Bounded Distance +1 Soft-Decision Reed-Solomon Decoding.
In IEEE Transactions on Information Theory, pages 704-720, vol. 42, no. 3, May
1996.
4. E. R. Berlekamp, R. J. McEliece and H. C. A. van Tilborg. On the inherent intractability of certain coding problems. IEEE Transactions on Information
Theory, 24:384{386, 1978.
5. R. DeMillo and R. Lipton. A probabilistic remark on algebraic program testing. Information Processing Letters, 7(4):193{195, June 1978.
6. O. Goldreich, R. Rubinfeld and M. Sudan. Learning polynomials with
queries: The highly noisy case. Proceedings of the 36th Annual IEEE Symposium
on Foundations of Computer Science, pp. 294{303, 1995.
7. D. Grigoriev. Factorization of Polynomials over a Finite Field and the Solution
of Systems of Algebraic Equations. Translated from Zapiski Nauchnykh Seminarov
Lenningradskogo Otdeleniya Matematicheskogo Instituta im. V. A. Steklova AN
SSSR, Vol. 137, pp. 20-79, 1984.
8. E. Kaltofen. A Polynomial-Time Reduction from Bivariate to Univariate Integral Polynomial Factorization. In 23rd Annual Symposium on Foundations of
Computer Science, pages 57-64, 1982.
9. E. Kaltofen. Polynomial factorization 1987{1991. LATIN '92, I. Simon (Ed.)
Springer LNCS, v. 583:294{313, 1992.
10. R. Lidl and H. Niederreiter. Introduction to Finite Fields and their Applications. Cambridge University Press, 1986
11. F. J. MacWilliams and N. J. A. Sloane. The Theory of Error-Correcting
Codes. North-Holland, Amsterdam, 1981.
12. J. Radhakrishnan. Personal communication, January, 1996.
13. J. T. Schwartz. Fast probabilistic algorithms for veri cation of polynomial identities. Journal of the ACM, 27(4):701{717, 1980.
14. M. Sipser and D. A. Spielman. Expander codes. IEEE Transactions on Information Theory, 42(6):1710{1722, 1996.
15. D. A. Spielman. Linear-time encodable and decodable error-correcting codes.
IEEE Transactions on Information Theory, 42(6):1723{1731, 1996.
16. M. Sudan. Decoding of Reed Solomon codes beyond the error-correction
bound.
Journal of Complexity, 13(1):180-193, March 1997. See also
http://theory.lcs.mit.edu/~ madhu/papers.html for a more recent version.
17. J. H. van Lint. Introduction to Coding Theory. Springer-Verlag, New York, 1982.
18. A. Vardy. Algorithmic complexity in coding theory and the minimum distance
problem. Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of
Computing, pp. 92{109, 1997.
19. B. L. van der Waerden. Algebra, Volume 1. Frederick Ungar Publishing Co.,
Inc., page 82.
20. L. Welch and E. R. Berlekamp. Error correction of algebraic block codes. US
Patent Number 4,633,470, issued December 1986.
21. R. E. Zippel. Probabilistic algorithms for sparse polynomials. EUROSAM '79,
Lecture Notes in Computer Science, 72:216{226, 1979.
This article was processed using the LATEX macro package with LLNCS style
Source Exif Data:
File Type : PDF File Type Extension : pdf MIME Type : application/pdf PDF Version : 1.2 Linearized : No Page Count : 16 Create Date : 1999:04:05 13:27:50 Producer : Acrobat Distiller 3.0 for Power Macintosh Creator : dvips 5.58 Copyright 1986, 1994 Radical Eye Software Title : MIT-LCS-TR-731.psEXIF Metadata provided by EXIF.tools