Summary  Encode
and decode files using a Hamming (7,4) algorithm. 
Topics 
Error
correcting codes Matrix multiplication Binary representations of data Binary I/O (optional) 
Audience 
Appropriate
for CS2 or a later
course. A simplified version could be used in CS1. 
Difficulty 
This
is an intermediate assignment, taking 1 or 2 weeks for a CS2
student. 
Strengths 
Integrates
disparate topics from programming and discrete math into a cool
application. Solves a classical problem in Computer Science. 
Weaknesses 
Like
so many other topics in CS, there are some very inelegant ways to
approach parts of the problem. For example, if encoding
Integers, Java's Integer
class contains a method: static
String
toBinaryString(int i), which
students can use to pull apart the Integer into its binary
representation rather than use the more appropriate bitwise operators. 
Dependencies 
Requires
an understanding of matrix multiplication and binary representations of
data, including bitwise operators and binary I/O. 
Variants 
There
are three steps to implementing Hamming codes and you can make
each easier or more difficult. Make the Data Bits Accessible Easy  Give the students an array of bits. Harder  Give the students an array of Integers and make them pull out the bits of each using bitwise operators. Most Difficult  Give the students an arbitrary file and make them read it byte by byte pulling the bits out of each byte. Calculate the Parity Bits Brute Force, but Easy to Understand  Add the appropriate data bits together to calculate each parity bit. Easy, but Harder to Understand  Use matrix multiplication (mod 2) to generate the code block. Most Difficult  Have the students develop the matrices needed on their own. Storing the Results Easy  Have the students write the code bits to the screen. Harder  Append a leading 0 to each code block and have the students write the byte to a file. Most Difficult  Have the students develop a BitStream class and write the results to a file using it. 
(The following discussion borrows, in part, from Wikipedia's Hamming code page.) Hamming codes are errorcorrecting codes. They are named after their inventor, Richard Hamming. Hamming was working for Bell labs in the 1940s. He grew frustrated with how often he had to restart his programs because a read error occurred. He developed Hamming codes as a way of detecting and correcting errors. Hamming codes can detect and correct singlebit errors or can detect (but not correct) up to two simultaneous bit errors.
Bit
position 
1 
2 
3 
4 
5 
6 
7 
... 

Encoded
data bits 
p1 
p2 
d1 
p3 
d2 
d3 
d4 

Parity 
p1 
X 

X 

X 

X 

p2 

X 
X 


X 
X 

p3 



X 
X 
X 
X 
Shown are 7 encoded bits (3 parity, 4 data) but the pattern continues indefinitely. The key thing about Hamming Codes that can easily be seen from visual inspection is that any given bit has a unique parity bit coverage. For example, the only data bit covered by p1 and p3 (and no other bits) is bit 5 (d2). It is this unique bit coverage that lets a Hamming Code correct any single bit error.
Suppose we
want to use Hamming (7,4) to encode the byte 1011 0001.
The first thing we will do is split the byte into two Hamming code
data blocks, 1011 and 0001.
We expand the first block on the left to 7 bits: _ _ 1 _ 0 1 1.
The first missing bit (bit 1) is 0, because adding bits 3, 5 and 7
gives an even number (2).
The second missing bit (bit 2) is 1, because adding bits 3, 6 and 7
gives an odd number (3).
The last missing bit (bit 4) is 0, because adding bits 5, 6 and 7
gives an even number (2).
This means our 7 bit block is: 0 1
1 0
0 1 1
We expand the
second block to 7 bits using similar logic, giving: 1 1
0 1
0 0 1
Extra info about this assignment: