1
Setting the scene
1.1 The problem
Next Saint Valentineās day you may be lucky enough to receive the message
from someone whose sentiments are pretty clear, in spite of not quite having mastered the word-processor.
This is a trivial example but also a very instructive one. It will lead us to quite a good understanding of what coding theory is all about. Before reading on, think for a few minutes about : how you know the message contains an error; why you are confident about the location and number of errors; why you are confident that you can correct the error(s); whether other similar cases may not be so easy to correct.
Let us now consider those questions in order. Words in English, and in any other natural language, are strings of letters taken from a finite alphabet, but not all such strings correspond to meaningful words: āIā does; āY O Uā does; āL M V Eā doesnāt; so we have detected an error in the second word. As for the location being letter 2 of word 2, this seems the most plausible thing which could have gone wrong, as the simple remedy of changing that M to an O restores a meaningful word and makes the complete message plausible. No other replacement for the M does this, so we have done something about question 3 too. But before we get too confident, couldnāt the correct message have been āI LIKE YOUā with errors made in the middle two letters of the second word? Well, yes, but you could argue that two errors are less likely than one because the sender is known to be not too bad at word processing. Finally, it is possible to receive a message which contains one error but the recipient can be unaware of its existence. A simple example of this is the received message:
I have no way of knowing whether this is really the senderās intention, or whether the real message is
transmitted with one error. To make progress here I would need to use information other than the message itself, for example, my name is not Lou, and I donāt know anyone of that name. In other words, I would be using the context of the message.
It is useful at this stage to make three general points arising from our examples:
1. Natural languages have built-in redundancy, and it is this which gives hope of being able to detect and correct errors.
2. Our confidence that our error correction is valid is greater if we have some assurance that a small number of errors is more likely than a larger number.
3. It may happen that we can detect that an error has been made, but cannot correct it with confidence.
The form in which we have already met redundancy is that not all strings of letters are meaningful words. Another is illustrated by observing the effect of simply cutting out large chunks of the original message. For example, every vowel and every space between words has been deleted from an English sentence and the result is
You will probably be able to reassemble the original message without too much difficulty, perhaps with a little help from an intelligent guess about the context.
An example of the third point is a one-word message received as L M V E. You can be sure an error has been made, but, without additional information, will be unable to decide between L O V E and L I V E.
1.2 The channel ā cause of the problem
We have seen how some features of a language, principally its redundancy, and facts about the sender, like being prone to make rare single errors but less likely to make more, can help the receiver to recover a slightly garbled message. If our messages are important enough to be guaranteed errorfree could we not simply put the onus on the sender to use a sufficiently thorough system of checks that no message containing errors was ever sent? The only thing which prevents this from being an excellent idea is that in most significant applications the errors do not arise from mistakes on the part of the sender, but rather, as a result of what happens to the message after leaving the sender and before arriving at the receiver. In other words, the communication channel has a vital role to play. For instance, in our first example the O received as an M may be the result of smudging.
A more clear-cut example is a conversation between two people at a rather noisy party. The speakerās words could be uttered perfectly clearly but his listener could fail to receive some of them, or receive a distorted version, due to cross-talk from other conversations, or loud music, etc. Here the idea (or fact, or comment, or etc) which the speaker wishes to transmit has to be encoded into a form suitable for the channel Here the channel is the air between speaker and listener; the words of the message are encoded as pressure waves in the air; these impinge on the listenerās ear; finally a complex decoding mechanism reinterprets these waves as words.
In coding theory and elsewhere the everyday term noise is used to denote any inherent feature of a communication channel which tends to distort messages, and which are generally outside the control of both sender and receiver. Noise is a fact of life in both the elementary examples we have mentioned so far and in much more sophisticated technological examples. To mention three fairly obvious ones: telephone lines are subject to unavoidable crackle; the signals which carry pictures of remote bits of the solar system back to Earth can be distorted by cosmic rays and solar flares; information stored in computer memories can be corrupted by the impact of stray alpha particles, ā¦ and so on.
1.3 Cunning coding ā solution of the problem
First consider a very simple situation in which the sender only needs to send one of two possible messages, say āyesā or ānoā, āstayā or āgoā, āattackā or āretreatā, etc. These days messages are often sent as digital pulses rather than as written or spoken words, so let us suppose our two possible messages are coded as 0 and 1. This has the virtue of simplicity, but the price to be paid is that a single error can mean disaster. If the general receives intelligence that his troops are vastly outnumbered, so sends ā0ā meaning āretreatāand a stray bit of electromagnetic noise corrupts this to ā1ā for āattackā, then the consequences could be most unpleasant, so unpleasant that one could not be expected to tolerate such occasional errors even if very rare. The source of the problem is that there is no redundancy at all in this system, so no chance of detecting that the received message is an error. It has been said, and I paraphrase slightly, that modern coding theory is all about replacing the redundancy we lose in going from natural (English) to artificial (digital) language in a sufficiently cunning way to enhance the error-correcting capability of the language.
It is very easy to give examples of simple ways of achieving this. If we stick with our primitive two-message system but this time agree to transmit 00 whenever we intend 0 or āretreatā and 11 whenever we intend 1 or āattackā, then there are just two messages, 0 and 1, and these are encoded by the codewords 00 and 11 respectively. Now suppose the channel is subject to noise which can corrupt codewords by changing a 0 to a 1 or a 1 to a 0, and the codeword 00 is sent. One of four things can happen. It may be received
as 00 with no corruption
or 01 with the second digit corrupted
or 10 with the first digit corrupted
or 11 if both digits are corrupted,
and a similar set of possibilities occurs if 11 is sent.
Now put yourself in the position of the receiver who only has available his pair of digits and no knowledge of what was sent. (To make things more friendly weāll make SiĆ¢n the sender and Rhidian the receiver.)
If Rhidian receives 00 he knows that either 00 was sent with no interference from noise, or that 11 was sent but both digits were corrupted. He is in a similar position if he receives 11. On the other hand if 01 is received it is certain that one of the two digits has been corrupted because 01 is not a codeword, but he has no idea whether 00 was sent and the second digit corrupted or 11 was sent and the first digit corrupted. Likewise if 10 is received.
It seems then that so much uncertainty still remains that nothing worth-while has been achieved. But notice how the situation changes if we know that under no circumsta...