Finding the Differences
(This page is full of technical detail. The next page is much less technical - it discusses the differences I found in Episode 3 (Fit the Third), followed by the differences found in other episodes.)
One day, quite out of the blue, I received a complete set of off-air recordings by mail. They were the original episodes. I thought about how to share them with the world. Of course, putting entire episodes online would create a copyright problem, and there could be no "abandonware" justification as the CDs are still on sale.
The interesting parts of the off-air recordings are the places where they differ from the CDs that you can buy. I set out to identify these places, ignoring all the material that is the same. By concentrating on these "abandoned" pieces of the episodes, I avoid redistributing the material on the CDs.
Take two sound recordings: one from the commercial CD release of HHGTTG and another from a recording of a radio broadcast. The two recordings contain the same episode, but there may be edits. Material may have been added, changed or deleted.
The objective is to find these differences with accuracy and minimal effort. One method that is guaranteed to work is a manual comparison, e.g. listening to a scene from one recording, then the other. However, this is painstaking and error-prone.
If the two recordings were digital, it might be possible to locate differences using a program (e.g. diff). However, the recordings are sourced from analogue tape. Although they sound the same to a human, they are composed of totally different digital data.
The biggest problem is that the recordings are not aligned. When recordings are aligned, events in one recording happen at exactly the same time in the other.
However, these recordings are not aligned, and aligning them is not simply a matter of determining the alignment offset (right) required to synchronise one with the other. This is because one is slightly faster than the other, so the alignment offset changes during each recording. Worse still, the relative speed also changes, so the alignment offset changes in a non-linear fashion. Therefore, you can't align two recordings together by finding the alignment offset at the beginning and end.
The change of speed happens because of imperfections in the recording technology. Analogue tape recorders attempt to ensure that tape moves at a constant speed. The mechanism that does this is very accurate, assuming decent equipment is used, but it is not exact. When a recording is copied many times, tiny inaccuracies accumulate: each inaudible in isolation, but very noticeable when added together. There are differences in speed in every pair of recordings that I listened to.
Aligned recordings can be compared in two ways. Firstly, a human could listen to both recordings at the same time, one playing in the "left" channel of a stereo system, the other in the "right" channel. The result is mono sound when the recordings are identical; differences are immediately obvious.
Secondly, if the synchronisation is perfect and the amplitude of both recordings is the same, then one recording can be subtracted from the other. The result is silence when the recordings are identical and a mixture of both recordings when they are not.
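The subtraction test can be sketched in a few lines of Python. This is only an illustration on made-up sample values, assuming both recordings are already aligned, equally loud and stored as lists of numbers; the function names `residual` and `peak` are invented for this example.

```python
def residual(a, b):
    """Subtract recording b from recording a, sample by sample."""
    if len(a) != len(b):
        raise ValueError("recordings must be aligned and of equal length")
    return [x - y for x, y in zip(a, b)]


def peak(samples):
    """Largest absolute sample value; 0.0 means complete silence."""
    return max((abs(s) for s in samples), default=0.0)


# Made-up sample values, chosen only to illustrate the idea.
original = [0.0, 0.5, -0.25, 0.75, -0.5, 0.25]
identical = list(original)
edited = [0.0, 0.5, -0.25, 0.25, 0.0, 0.25]  # samples 3 and 4 differ

print(peak(residual(original, identical)))  # 0.0: silence, no difference
print(peak(residual(original, edited)))     # 0.5: something was changed

# Positions where the residual is not silent mark the differences.
changed = [i for i, d in enumerate(residual(original, edited)) if d != 0]
print(changed)  # [3, 4]
```

With real analogue sources the residual would never be exactly zero, so in practice a threshold would be needed rather than a test against 0.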
How to Align and Compare Two Analogue Recordings
Alignment and comparison would be easy if there was an analogue equivalent of the diff tool, which compares text files. Others have realised a need for such a tool. "Audio Diff" has been considered as an extension for the excellent Audacity sound-manipulation software. However, this is only a proposal. It seems there is no code yet.
Music Alignment Tool CHest
The authors of the Audacity web page give a number of useful hints. Firstly, some software has already been written to align analogue recordings. The Music Alignment Tool CHest (MATCH) is a program written by music researchers to align sound files automatically. While the alignment process requires substantial CPU time and memory, the results are quite accurate and no human intervention is required. There is an interesting research paper about MATCH.
MATCH can produce "session files", aligning points in one sound file to another. I experimented with MATCH for some time, but had some difficulty using it to synchronise the HHGTTG episodes. It is not well-suited to detecting differences, as it assumes the input files are the same. It is also intended for music, and it seems that some parts of HHGTTG, particularly the quieter periods, the voice acting and the weird sound effects, can defeat its alignment algorithm. Additionally, I was not able to determine how its algorithm worked, as it is closed-source and the research papers concentrate instead on results and music theory.
Another hint from the Audacity web page was the suggestion of the use of dot plots to compare recordings. A dot plot is a two-dimensional graph, drawn from two data sets X and Y. X and Y are "strings"; i.e. sequences of data elements. I will use the notation X(i) to represent the i-th member of X.
In the following example, X and Y will be strings composed of text. This is what a programmer will assume you mean if you say "string" - but bear in mind that a string might be composed of any sort of data, including audio samples.
Let X = "wholly remarkable", and Y = "remarkable book".
Within the dot plot, the colour of every point (x, y) is defined by the difference d(x,y) = X(x) - Y(y). d(x,y) is zero if there is an exact match. In the example on the right, zero is drawn as black.
Diagonal lines appear within the dot plot whenever a chain of matches occurs close together. One such diagonal line is visible in the corner of the dot plot, where "remarkable" occurs in both X and Y. This is not the only match; for example, "l" in "wholly" matches "l" in "remarkable", resulting in more black squares. But "remarkable" is the only long sequence of matches.
The black diagonal line stands out within the noise of inferior matches. Using it, we can match X(7) to Y(0), X(8) to Y(1) and so on. That is, the dot plot allows us to align the two strings. In this case, the alignment offset is 7.
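The text example can be reproduced with a short Python sketch. It assumes that d(x,y) is taken as the difference between character codes, and it picks the alignment offset by counting matches along each diagonal, which is one simple way of finding the line rather than the only one. The function names are invented for this illustration.

```python
def dot_plot(x, y):
    """Rows are indexed by position in Y, columns by position in X.
    Each cell holds d = code of X(x) minus code of Y(y); 0 is a match."""
    return [[ord(cx) - ord(cy) for cx in x] for cy in y]


def render(x, y):
    """Draw the plot as text: '#' for an exact match, '.' otherwise."""
    return "\n".join(
        "".join("#" if d == 0 else "." for d in row)
        for row in dot_plot(x, y)
    )


def best_offset(x, y):
    """Offset k with the most matches of the form X(k + j) == Y(j)."""
    counts = {}
    for j, row in enumerate(dot_plot(x, y)):
        for i, d in enumerate(row):
            if d == 0:
                counts[i - j] = counts.get(i - j, 0) + 1
    return max(counts, key=counts.get)


X = "wholly remarkable"
Y = "remarkable book"

print(render(X, Y))
print(best_offset(X, Y))  # 7
```

The ten matches of "remarkable" all lie on the diagonal with offset 7, which outweighs the scattered single-letter matches elsewhere in the plot.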
Dot plots make it possible to visualise this similarity even on a much larger scale, with much longer strings - such as entire audio files.
The dot plot also shows where edits have taken place. Suppose that X = "decided to build three ships, three Arks in space". This is then edited to produce Y = "decided to build three Arks in space". The black diagonal line (right) is broken at the point of the edit; it resumes some distance to the right. The dot plot allows us to both align the strings and mark the places where alignment is impossible, because something has been edited out.
Similarly, an edit might insert something entirely new. This would cause the black diagonal line to disappear completely, resuming elsewhere in the dot plot.
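The broken diagonal can also be found programmatically. The sketch below lists every unbroken run of matches above a minimum length, using the "Arks" example; a jump in the offset between consecutive runs marks an edit. The function name `diagonal_runs` and the `min_len` threshold are assumptions made for this illustration, and the brute-force search is only practical for short strings.

```python
def diagonal_runs(x, y, min_len=5):
    """Find maximal runs where X(i + k) == Y(j + k) for k = 0, 1, 2...
    Returns (x_start, y_start, length) for runs of at least min_len."""
    runs = []
    for i in range(len(x)):
        for j in range(len(y)):
            if x[i] != y[j]:
                continue
            # Skip points in the middle of a run already reported.
            if i > 0 and j > 0 and x[i - 1] == y[j - 1]:
                continue
            k = 0
            while i + k < len(x) and j + k < len(y) and x[i + k] == y[j + k]:
                k += 1
            if k >= min_len:
                runs.append((i, j, k))
    return runs


X = "decided to build three ships, three Arks in space"
Y = "decided to build three Arks in space"

for x_start, y_start, length in diagonal_runs(X, Y):
    print(x_start, y_start, length, "offset", x_start - y_start)
# 0 0 23 offset 0
# 29 16 20 offset 13
```

The offset jumps from 0 to 13 between the two runs, which is the length of the material that was edited out. The two runs overlap on " three " in Y because the plot alone cannot say which copy of "three" in X survived the edit.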
Dot plots work well for much more complicated strings, such as the sound recording of a complete episode of HHGTTG. I experimented with many different ways of producing the dot plots, and eventually settled on the following process:
- I normalised the sound level in both files and downmixed them to mono.
- I produced each X(i) value as the mean sample amplitude for a range of samples from s·i to s·i + s, where s is a scale factor (used as a "zoom" setting). I used the absolute value for each sample so that negative and positive samples did not cancel out.
- I wrote a program to draw dot plots. It allowed me to mark "waypoints" on them. The idea was to add waypoints wherever the black diagonal line was obvious. The image on the right illustrates how this worked: the line drawn between two waypoints gives the relationship between X and Y for that part of the graph. The program is not very full-featured but source code is available if you want to see it.
- I went through each episode marking waypoints. Their locations were stored in a file for later use. The most difficult job is finding the diagonal line to begin with; once it is found, it is easy to trace. Differences between X and Y (signifying edits) are obvious because the line disappears completely.
Then, I loaded the two aligned recordings side-by-side in Audacity and routed one to the left channel, and the other to the right. I was able to listen to both playing at the same time. This provided a final check to ensure that I hadn't missed any differences while working with the dot plot. The result was a distinctive mono sound when the recordings matched. The alignment was usually exact: slight losses of alignment manifested themselves as a faint phasing effect.
The following pages talk about the differences that I actually found. There was no difference in either episode 1 or episode 2. This is rather surprising, as at least one line of monologue concerning the worst poet in the Universe subsequently led to legal problems, yet it is still intact.
Episode 3 (Fit the Third) contains the best-known difference between the original broadcast and the CD release: the missing Magrathea scene.