Linear Block Codes II
Author: Peter Kim | Professor: Mostafa Darabi

This is a continuation of Linear Block Codes I.

Error-Detection Capability

Error detection is a critical feature of block codes, enabling the receiver to identify distortions introduced during transmission over a noisy channel.

Error Detection Mechanism

When a codeword \( \mathbf{v} \) is transmitted over a binary channel, noise may distort it, resulting in a received vector:

\[ \mathbf{r} = \mathbf{v} + \mathbf{e}, \]

where \( \mathbf{e} \) is the error pattern, and addition is modulo-2. The Hamming distance between the transmitted and received vectors is:

\[ d(\mathbf{v}, \mathbf{r}) = w_H(\mathbf{e}), \]

where \( w_H(\mathbf{e}) = \ell \) is the number of errors. For a linear block code \( C(n, k) \) with minimum distance \( d_{\text{min}} \), any two distinct codewords differ by at least \( d_{\text{min}} \) positions. Thus, if \( 0 < \ell < d_{\text{min}} \), \( \mathbf{r} \) cannot be a codeword in \( C \), i.e., \( \mathbf{r} \notin C \).

The receiver detects an error if \( \mathbf{r} \) is not a valid codeword, which can be checked using the parity-check matrix \( \mathbf{H} \):

\[ \mathbf{r} \cdot \mathbf{H}^T \neq \mathbf{0}. \]

The vector \( \mathbf{s} = \mathbf{r} \cdot \mathbf{H}^T \) is called the syndrome. A code with minimum distance \( d_{\text{min}} \) can detect all error patterns with \( d_{\text{min}} - 1 \) or fewer errors. However, if \( \ell \geq d_{\text{min}} \), \( \mathbf{e} \) may coincide with a nonzero codeword, in which case \( \mathbf{r} \) is itself a codeword and the error goes undetected.

Syndrome Calculation

The syndrome is computed as:

\[ \mathbf{s}_{1 \times (n-k)} = \mathbf{r}_{1 \times n} \cdot \mathbf{H}^T_{n \times (n-k)}, \]

where \( \mathbf{s} = (s_0, s_1, \ldots, s_{n-k-1}) \). If \( \mathbf{s} = \mathbf{0} \), \( \mathbf{r} \) is a codeword; otherwise, an error is detected.
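For illustration, here is a minimal Python sketch of this check (NumPy is assumed; the matrix is the \( \mathbf{H} \) of the \( C(4, 2) \) example in the next subsection):

```python
import numpy as np

def syndrome(r, H):
    """Compute s = r * H^T over GF(2)."""
    return (np.array(r) @ np.array(H).T) % 2

# Parity-check matrix of the C(4, 2) example below
H = [[1, 0, 1, 0],
     [1, 1, 0, 1]]

print(syndrome([1, 0, 1, 1], H))  # [0 0]: r is a codeword
print(syndrome([1, 1, 1, 1], H))  # [0 1]: nonzero syndrome, error detected
```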

A linear block code \( C(n, k) \) can detect \( 2^n - 2^k \) error patterns of length \( n \): of the \( 2^n - 1 \) nonzero error patterns, exactly \( 2^k - 1 \) (the nonzero codewords) are undetectable, leaving \( (2^n - 1) - (2^k - 1) = 2^n - 2^k \) detectable ones.

Conversely, an error pattern \( \mathbf{e} \) is undetectable if and only if it equals a nonzero codeword in \( C \), since then \( \mathbf{r} = \mathbf{v} + \mathbf{e} \) is another codeword and \( \mathbf{r} \cdot \mathbf{H}^T = \mathbf{0} \).

Example

Design a syndrome circuit for the \( C(4, 2) \) code with generator matrix:

\[ \mathbf{G} = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \end{bmatrix}. \]

The parity-check matrix, from \( \mathbf{G} = [\mathbf{I}_2 | \mathbf{P}] \) where \( \mathbf{P} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \), is:

\[ \mathbf{H} = [\mathbf{P}^T | \mathbf{I}_2] = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{bmatrix}. \]

The syndrome is:

\[ \mathbf{s} = (s_0, s_1) = \mathbf{r} \cdot \mathbf{H}^T = (r_0, r_1, r_2, r_3) \cdot \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} = (r_0 + r_2, r_0 + r_1 + r_3), \]

where addition is modulo-2. The syndrome circuit therefore computes \( s_0 = r_0 \oplus r_2 \) and \( s_1 = r_0 \oplus r_1 \oplus r_3 \). If \( \mathbf{s} \neq (0, 0) \), an error is detected.
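At the logic level this is a pair of XOR trees; a minimal sketch in Python:

```python
def c42_syndrome(r0, r1, r2, r3):
    """Syndrome of the C(4, 2) code, one XOR tree per syndrome bit."""
    s0 = r0 ^ r2
    s1 = r0 ^ r1 ^ r3
    return (s0, s1)

assert c42_syndrome(1, 0, 1, 1) == (0, 0)  # codeword: no error detected
assert c42_syndrome(1, 1, 1, 1) == (0, 1)  # error detected
```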

Probability of Undetected Errors

An error pattern is undetectable if it matches a nonzero codeword. For a code \( C(n, k) \), let \( A_i \) be the number of codewords of weight \( i \), forming the weight distribution \( A_0, A_1, \ldots, A_n \). Over a Binary Symmetric Channel (BSC) with error probability \( p \), the probability of an undetected error is:

\[ P_U(E) = \sum_{i=1}^n A_i p^i (1 - p)^{n-i}, \]

since \( A_0 = 1 \) (the zero codeword) does not contribute to errors. Given \( A_i = 0 \) for \( i < d_{\text{min}} \), this reduces to:

\[ P_U(E) = \sum_{i=d_{\text{min}}}^n A_i p^i (1 - p)^{n-i}. \]

For the \( C(4, 2) \) code (\( d_{\text{min}} = 2 \)): \( A_0 = 1 \), \( A_1 = 0 \), \( A_2 = 1 \), \( A_3 = 2 \), \( A_4 = 0 \), so:

\[ P_U(E) = 1 \cdot p^2 (1 - p)^2 + 2 \cdot p^3 (1 - p)^1 + 0 \cdot p^4 (1 - p)^0 = p^2 (1 - p)^2 + 2 p^3 (1 - p). \]
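A quick numeric evaluation of this expression (a sketch; `p_undetected` is an illustrative helper):

```python
def p_undetected(A, n, p):
    """P_U(E) = sum_{i>=1} A_i p^i (1-p)^(n-i) for weight distribution A."""
    return sum(A.get(i, 0) * p**i * (1 - p)**(n - i) for i in range(1, n + 1))

# C(4, 2): A_2 = 1, A_3 = 2
print(p_undetected({2: 1, 3: 2}, n=4, p=0.01))  # ~1.0e-4
```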

Syndrome-Based Error Correction

The syndrome depends only on the error pattern, not the transmitted codeword. For \( \mathbf{r} = \mathbf{v} + \mathbf{e} \):

\[ \mathbf{s} = \mathbf{r} \cdot \mathbf{H}^T = (\mathbf{v} + \mathbf{e}) \cdot \mathbf{H}^T = \mathbf{v} \cdot \mathbf{H}^T + \mathbf{e} \cdot \mathbf{H}^T = \mathbf{0} + \mathbf{e} \cdot \mathbf{H}^T = \mathbf{e} \cdot \mathbf{H}^T, \]

since \( \mathbf{v} \in C \) implies \( \mathbf{v} \cdot \mathbf{H}^T = \mathbf{0} \).

Error correction proceeds as follows:

The equation \( \mathbf{e} \cdot \mathbf{H}^T = \mathbf{s} \) yields \( n - k \) equations in \( n \) unknowns, so it has \( 2^k \) solutions, which together form one coset of \( C \). The most probable \( \mathbf{e} \) is the minimum-weight solution, since fewer errors are more likely over a BSC with \( p < 0.5 \).

Example

For a \( C(5, 2) \) code with generator matrix:

\[ \mathbf{G} = \begin{bmatrix} 0 & 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 & 1 \end{bmatrix}_{2 \times 5}, \]

if the received vector is \( \mathbf{r} = (1, 0, 0, 0, 0) \) over a BSC with \( p = 0.1 \), what is the most probable transmitted codeword?

First, compute the parity-check matrix \( \mathbf{H} \). Since \( \mathbf{G} \) is not in systematic form \( [\mathbf{I}_k | \mathbf{P}] \), we derive \( \mathbf{H} \) such that \( \mathbf{G} \cdot \mathbf{H}^T = \mathbf{0} \). Given:

\[ \mathbf{H} = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \end{bmatrix}_{3 \times 5}, \]

one can verify that \( \mathbf{g}_0 \cdot \mathbf{h}_j = 0 \) and \( \mathbf{g}_1 \cdot \mathbf{h}_j = 0 \) for every row \( \mathbf{h}_j \) of \( \mathbf{H} \). Compute the syndrome:

\[ \mathbf{s} = \mathbf{r} \cdot \mathbf{H}^T = (1, 0, 0, 0, 0) \cdot \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} = (1 + 0 + 0 + 0 + 0, 0 + 0 + 0 + 0 + 0, 0 + 0 + 0 + 0 + 0) = (1, 0, 0). \]

Solve for the error pattern \( \mathbf{e} \):

\[ (1, 0, 0) = \mathbf{e} \cdot \mathbf{H}^T = (e_0, e_1, e_2, e_3, e_4) \cdot \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}, \]

yielding the system:

\[ \begin{cases} e_0 + e_4 = 1 \\ e_1 + e_3 + e_4 = 0 \\ e_2 + e_3 + e_4 = 0 \end{cases} \]

Possible solutions: \( \{(1, 0, 0, 0, 0), (0, 0, 0, 1, 1), (0, 1, 1, 0, 1), (1, 1, 1, 1, 0)\} \) (weights 1, 2, 3, 4). For \( p = 0.1 < 0.5 \), lower weight is more probable, so \( \mathbf{e} = (1, 0, 0, 0, 0) \). Thus:

\[ \hat{\mathbf{v}} = \mathbf{r} + \mathbf{e} = (1, 0, 0, 0, 0) + (1, 0, 0, 0, 0) = (0, 0, 0, 0, 0), \]

a valid codeword. If instead \( p = 0.6 > 0.5 \), higher-weight error patterns are more probable, so \( \mathbf{e} = (1, 1, 1, 1, 0) \), and:

\[ \hat{\mathbf{v}} = (1, 0, 0, 0, 0) + (1, 1, 1, 1, 0) = (0, 1, 1, 1, 0), \]

also a codeword.
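The following Python sketch reproduces this decoding by brute force (an illustration, not an efficient decoder): it enumerates all \( 2^n \) error patterns, keeps those whose syndrome matches, and picks the most probable one on a BSC with crossover probability \( p \).

```python
import itertools
import numpy as np

H = np.array([[1, 0, 0, 0, 1],
              [0, 1, 0, 1, 1],
              [0, 0, 1, 1, 1]])

def decode(r, H, p):
    """Return the most probable codeword v = r + e on a BSC(p)."""
    n = H.shape[1]
    s = tuple(np.array(r) @ H.T % 2)
    # All error patterns consistent with the syndrome (one coset, 2^k vectors)
    coset = [e for e in itertools.product([0, 1], repeat=n)
             if tuple(np.array(e) @ H.T % 2) == s]
    # A weight-w pattern has probability p^w (1-p)^(n-w)
    e = max(coset, key=lambda e: p**sum(e) * (1 - p)**(n - sum(e)))
    return (np.array(r) + np.array(e)) % 2

print(decode([1, 0, 0, 0, 0], H, p=0.1))  # [0 0 0 0 0]
print(decode([1, 0, 0, 0, 0], H, p=0.6))  # [0 1 1 1 0]
```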

Standard Array

The standard array is a systematic method to partition \( 2^n \) possible received vectors into \( 2^k \) disjoint subsets for decoding a \( C(n, k) \) code.

Properties of Standard Array

For a code \( C(n, k) \) with codewords \( \mathbf{v}_0 = \mathbf{0}, \mathbf{v}_1, \ldots, \mathbf{v}_{2^k-1} \), the standard array has the following structure:

  1. The first row is the code \( C \) itself, starting with \( \mathbf{v}_0 = \mathbf{0} \).
  2. Each of the remaining \( 2^{n-k} - 1 \) rows is a coset \( \{\mathbf{e}_i, \mathbf{e}_i + \mathbf{v}_1, \ldots, \mathbf{e}_i + \mathbf{v}_{2^k-1}\} \), where the coset leader \( \mathbf{e}_i \) is an \( n \)-tuple not appearing in any previous row.
  3. Each column \( D_j = \{\mathbf{v}_j, \mathbf{v}_j + \mathbf{e}_1, \ldots, \mathbf{v}_j + \mathbf{e}_{2^{n-k}-1}\} \) contains \( 2^{n-k} \) vectors, with the codeword \( \mathbf{v}_j \) at the top.
  4. Every \( n \)-tuple appears exactly once, so the \( 2^k \) columns partition the space of \( 2^n \) vectors.

Decoding Using Standard Array

For a received vector \( \mathbf{r} \) in row \( i \) and column \( j \):

\[ \mathbf{r} = \mathbf{e}_i + \mathbf{v}_j, \]

where \( \mathbf{e}_i \) is the error pattern and \( \mathbf{v}_j \) is the decoded codeword. The \( \mathbf{e}_i \)’s are correctable error patterns.

Theorems on Error Correction

Theorem 1: A \( C(n, k) \) code can correct \( 2^{n-k} \) error patterns, one per coset.

Theorem 2: For a code with minimum distance \( d_{\text{min}} \), all \( n \)-tuples of weight:

\[ t = \left\lfloor \frac{d_{\text{min}} - 1}{2} \right\rfloor \]

or less can serve as coset leaders. Proof: suppose two distinct \( n \)-tuples \( \mathbf{x} \) and \( \mathbf{y} \) with \( w_H(\mathbf{x}), w_H(\mathbf{y}) \leq t \) lie in the same coset. Then \( \mathbf{x} + \mathbf{y} \) is a nonzero codeword and:

\[ w_H(\mathbf{x} + \mathbf{y}) \leq w_H(\mathbf{x}) + w_H(\mathbf{y}) \leq 2t < d_{\text{min}}, \]

a contradiction, since no nonzero codeword has weight less than \( d_{\text{min}} \). Hence any two such \( n \)-tuples lie in distinct cosets, and because \( \sum_{i=0}^t \binom{n}{i} \leq 2^{n-k} \), all of them can be chosen as coset leaders.

Theorem 3: All vectors in a coset have the same syndrome: \( (\mathbf{e}_i + \mathbf{v}_j) \cdot \mathbf{H}^T = \mathbf{e}_i \cdot \mathbf{H}^T \).

Theorem 4: Different cosets have distinct syndromes: if \( \mathbf{e}_i \cdot \mathbf{H}^T = \mathbf{e}_j \cdot \mathbf{H}^T \), then \( (\mathbf{e}_i + \mathbf{e}_j) \cdot \mathbf{H}^T = \mathbf{0} \), so \( \mathbf{e}_i + \mathbf{e}_j \in C \) and \( \mathbf{e}_i \), \( \mathbf{e}_j \) lie in the same coset, which for distinct coset leaders forces \( i = j \).

Example

Construct the standard array for the \( C(5, 2) \) code with:

\[ \mathbf{G} = \begin{bmatrix} 0 & 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 & 1 \end{bmatrix}, \]

minimizing error probability for \( p < 0.5 \). Codewords:

\[ \begin{aligned} \mathbf{v}_0 &= (0, 0) \cdot \mathbf{G} = (0, 0, 0, 0, 0), \\ \mathbf{v}_1 &= (0, 1) \cdot \mathbf{G} = (1, 1, 1, 0, 1), \\ \mathbf{v}_2 &= (1, 0) \cdot \mathbf{G} = (0, 1, 1, 1, 0), \\ \mathbf{v}_3 &= (1, 1) \cdot \mathbf{G} = (1, 0, 0, 1, 1). \end{aligned} \]

Standard array (coset leaders of minimum weight):

\[
\begin{array}{llll}
(0, 0, 0, 0, 0) & (1, 1, 1, 0, 1) & (0, 1, 1, 1, 0) & (1, 0, 0, 1, 1) \\
\mathbf{e}_1 = (1, 0, 0, 0, 0) & (0, 1, 1, 0, 1) & (1, 1, 1, 1, 0) & (0, 0, 0, 1, 1) \\
\mathbf{e}_2 = (0, 1, 0, 0, 0) & (1, 0, 1, 0, 1) & (0, 0, 1, 1, 0) & (1, 1, 0, 1, 1) \\
\mathbf{e}_3 = (0, 0, 1, 0, 0) & (1, 1, 0, 0, 1) & (0, 1, 0, 1, 0) & (1, 0, 1, 1, 1) \\
\mathbf{e}_4 = (0, 0, 0, 1, 0) & (1, 1, 1, 1, 1) & (0, 1, 1, 0, 0) & (1, 0, 0, 0, 1) \\
\mathbf{e}_5 = (0, 0, 0, 0, 1) & (1, 1, 1, 0, 0) & (0, 1, 1, 1, 1) & (1, 0, 0, 1, 0) \\
\mathbf{e}_6 = (1, 1, 0, 0, 0) & (0, 0, 1, 0, 1) & (1, 0, 1, 1, 0) & (0, 1, 0, 1, 1) \\
\mathbf{e}_7 = (1, 0, 1, 0, 0) & (0, 1, 0, 0, 1) & (1, 1, 0, 1, 0) & (0, 0, 1, 1, 1)
\end{array}
\]

Coset leaders are chosen with minimum weight (mostly weight 1) for \( p < 0.5 \), ensuring maximum likelihood decoding (MLD).
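The construction can be sketched in Python (brute force; where several minimum-weight vectors fall in the same coset, the leader choice is arbitrary and may differ from the table above):

```python
import itertools
import numpy as np

G = np.array([[0, 1, 1, 1, 0],
              [1, 1, 1, 0, 1]])
k, n = G.shape

# First row: the 2^k codewords
codewords = [tuple(np.array(m) @ G % 2)
             for m in itertools.product([0, 1], repeat=k)]
rows, used = [codewords], set(codewords)

# Pick unused n-tuples in order of increasing weight as coset leaders
for leader in sorted(itertools.product([0, 1], repeat=n), key=sum):
    if leader in used:
        continue
    coset = [tuple((np.array(leader) + v) % 2) for v in codewords]
    rows.append(coset)
    used.update(coset)

for row in rows:
    print(row)  # 2^(n-k) = 8 rows of 2^k = 4 vectors each
```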

Look-Up Table Decoding

Storing the entire standard array is inefficient. Instead, a look-up table maps syndromes to correctable error patterns.

Syndrome Look-Up Table

For \( C(n, k) \), there are \( 2^{n-k} \) syndromes, each corresponding to a unique coset leader. Steps:

  1. Compute \( \mathbf{s} = \mathbf{r} \cdot \mathbf{H}^T \).
  2. Find \( \mathbf{e}_\ell \) such that \( \mathbf{e}_\ell \cdot \mathbf{H}^T = \mathbf{s} \).
  3. Decode \( \hat{\mathbf{v}} = \mathbf{r} + \mathbf{e}_\ell \).

Example

For the \( C(5, 2) \) code, with \( \mathbf{H} = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \end{bmatrix} \), and \( \mathbf{r} = (0, 0, 1, 1, 0) \) over a BSC with \( p < 0.5 \):

Syndrome table:

\[
\begin{array}{cc}
\text{Error Pattern (Coset Leader)} & \text{Syndrome} \\
\hline
(0, 0, 0, 0, 0) & (0, 0, 0) \\
(1, 0, 0, 0, 0) & (1, 0, 0) \\
(0, 1, 0, 0, 0) & (0, 1, 0) \\
(0, 0, 1, 0, 0) & (0, 0, 1) \\
(0, 0, 0, 1, 0) & (0, 1, 1) \\
(0, 0, 0, 0, 1) & (1, 1, 1) \\
(1, 1, 0, 0, 0) & (1, 1, 0) \\
(1, 0, 1, 0, 0) & (1, 0, 1)
\end{array}
\]

Compute:

\[ \mathbf{s} = (0, 0, 1, 1, 0) \cdot \mathbf{H}^T = (0 + 0 + 0 + 0 + 0,\; 0 + 0 + 0 + 1 + 0,\; 0 + 0 + 1 + 1 + 0) = (0, 1, 0), \]

so \( \mathbf{e} = (0, 1, 0, 0, 0) \), and:

\[ \hat{\mathbf{v}} = (0, 0, 1, 1, 0) + (0, 1, 0, 0, 0) = (0, 1, 1, 1, 0), \]

a codeword, consistent with MLD for \( p < 0.5 \).
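In Python, the decoder reduces to a dictionary lookup (a sketch using the coset leaders listed above):

```python
import numpy as np

H = np.array([[1, 0, 0, 0, 1],
              [0, 1, 0, 1, 1],
              [0, 0, 1, 1, 1]])

leaders = [(0, 0, 0, 0, 0), (1, 0, 0, 0, 0), (0, 1, 0, 0, 0), (0, 0, 1, 0, 0),
           (0, 0, 0, 1, 0), (0, 0, 0, 0, 1), (1, 1, 0, 0, 0), (1, 0, 1, 0, 0)]

# Precompute the syndrome -> coset leader map
table = {tuple(np.array(e) @ H.T % 2): np.array(e) for e in leaders}

def decode(r):
    s = tuple(np.array(r) @ H.T % 2)   # step 1: syndrome
    e = table[s]                       # step 2: look up the coset leader
    return (np.array(r) + e) % 2       # step 3: correct

print(decode([0, 0, 1, 1, 0]))  # [0 1 1 1 0]
```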

Logic Level Implementation

Look-up table decoding can be implemented efficiently using switching functions derived from the syndrome table. For a linear block code \( C(n, k) \), the decoding table serves as a truth table for \( n \) functions:

\[ e_i = f_i(s_0, s_1, \ldots, s_{n-k-1}), \quad i = 0, 1, \ldots, n-1, \]

where \( s_i \) is the \( i \)-th entry of the syndrome vector \( \mathbf{s} \), and \( e_i \) is the \( i \)-th entry of the estimated error pattern \( \mathbf{e} \). These functions can be expressed using logical operations (e.g., AND, OR, NOT) based on the syndrome-to-error mapping.

For example, for the \( C(5, 2) \) code and the syndrome table above, minimizing each \( e_i \) as a Boolean function of \( (s_0, s_1, s_2) \) gives one realization:

\[ e_0 = s_0 (\bar{s}_1 + \bar{s}_2), \quad e_1 = s_1 \bar{s}_2, \quad e_2 = \bar{s}_1 s_2, \quad e_3 = \bar{s}_0 s_1 s_2, \quad e_4 = s_0 s_1 s_2, \]

where the overbar denotes NOT, juxtaposition denotes AND, and \( + \) denotes OR.

These logical expressions can be implemented in hardware (e.g., using gates) for real-time decoding, avoiding the need to store the entire standard array.
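A quick software check of these expressions against the syndrome table (a sketch; `error_bits` is an illustrative name):

```python
def error_bits(s0, s1, s2):
    """Gate-level realization of the syndrome-to-error-pattern mapping."""
    e0 = s0 and (not s1 or not s2)
    e1 = s1 and not s2
    e2 = s2 and not s1
    e3 = s1 and s2 and not s0
    e4 = s0 and s1 and s2
    return tuple(int(b) for b in (e0, e1, e2, e3, e4))

table = {(0, 0, 0): (0, 0, 0, 0, 0), (1, 0, 0): (1, 0, 0, 0, 0),
         (0, 1, 0): (0, 1, 0, 0, 0), (0, 0, 1): (0, 0, 1, 0, 0),
         (0, 1, 1): (0, 0, 0, 1, 0), (1, 1, 1): (0, 0, 0, 0, 1),
         (1, 1, 0): (1, 1, 0, 0, 0), (1, 0, 1): (1, 0, 1, 0, 0)}
assert all(error_bits(*s) == e for s, e in table.items())
```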

Error-Detection and Error-Correction Capabilities

The error-detection and correction capabilities of a block code are closely tied to its weight distribution and that of its dual code. This section revisits error detection and introduces polynomial representations and the MacWilliams identity.

Weight Distribution in Polynomial Form

The weight distribution of \( C(n, k) \) can be represented as a polynomial:

\[ A(z) = A_0 + A_1 z + A_2 z^2 + \cdots + A_n z^n, \]

where \( A_0 = 1 \) and \( A_i \) counts codewords of weight \( i \). Similarly, for the dual code \( C_d(n, n-k) \), the weight distribution is:

\[ B(z) = B_0 + B_1 z + B_2 z^2 + \cdots + B_n z^n, \]

with \( B_0 = 1 \). The MacWilliams identity relates these polynomials:

\[ A(z) = 2^{-(n-k)} (1 + z)^n B\left( \frac{1 - z}{1 + z} \right). \]

This identity allows computation of \( A(z) \) from \( B(z) \) or vice versa, leveraging the duality between \( C \) and \( C_d \).
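A numeric sanity check for the \( C(5, 2) \) code (a sketch: \( A(z) = 1 + 2z^3 + z^4 \) follows from the four codewords listed earlier, and \( B(z) = 1 + 2z^2 + 4z^3 + z^4 \) from enumerating the \( 2^3 \) dual codewords generated by the rows of \( \mathbf{H} \)):

```python
n, k = 5, 2
A = lambda z: 1 + 2*z**3 + z**4           # weight enumerator of C(5, 2)
B = lambda z: 1 + 2*z**2 + 4*z**3 + z**4  # weight enumerator of the dual

for z in (0.1, 0.5, 0.9):
    rhs = 2**(-(n - k)) * (1 + z)**n * B((1 - z) / (1 + z))
    assert abs(A(z) - rhs) < 1e-12  # MacWilliams identity holds
```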

Probability of Undetected Error via Dual Code

Using the MacWilliams identity, the probability of an undetected error can be expressed in terms of the dual code’s weight distribution. For \( C(n, k) \) over a BSC with error probability \( p \):

\[ P_U(E) = 2^{-(n-k)} B(1 - 2p) - (1 - p)^n, \]

where:

\[ B(1 - 2p) = \sum_{i=0}^n B_i (1 - 2p)^i. \]

This follows by substituting \( z = p/(1-p) \) into the MacWilliams identity: writing \( P_U(E) = (1-p)^n [A(p/(1-p)) - 1] \) and noting that \( (1-z)/(1+z) = 1 - 2p \) and \( (1+z)^n = (1-p)^{-n} \), the identity gives \( (1-p)^n A(p/(1-p)) = 2^{-(n-k)} B(1 - 2p) \); subtracting the probability \( (1-p)^n \) of the zero error pattern yields the result.
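The two expressions for \( P_U(E) \) can be checked against each other for the \( C(5, 2) \) code (a sketch reusing the enumerators above):

```python
n, k, p = 5, 2, 0.05

# Direct form: A_3 = 2, A_4 = 1 for C(5, 2)
direct = 2 * p**3 * (1 - p)**2 + p**4 * (1 - p)

# Dual form: B(z) = 1 + 2z^2 + 4z^3 + z^4 evaluated at z = 1 - 2p
z = 1 - 2 * p
via_dual = 2**(-(n - k)) * (1 + 2*z**2 + 4*z**3 + z**4) - (1 - p)**n

assert abs(direct - via_dual) < 1e-12
print(direct)  # ~2.3e-4
```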