Sound Ideas

Piano 2 - Size is everything

In part 1 I talked briefly about storage space, about how compressing a music player and a piece of music into less than 64KB is an important problem to solve.

Let's say, for example, that the demo is going to be five minutes long. For high quality music we need to choose a nice high sample rate, so let's do the same as CD and choose 44,100 samples per second. Each sample is two bytes in size and the music is stereo, so we have a total of 176,400 bytes per second, or roughly 50MB for the full five minutes. If we simply pre-record the music and store it, 64KB holds only about a third of a second of it.

And that's if we completely fill the entire 64KB demo with only music; we still need to put the graphics in there somewhere too! This is not a very good start.

So the obvious choice is: compress it! There are lots of cool music compression algorithms out there, the most famous and popular being MP3. It's not the best but it's not far off, either. MP3 lets us trade quality for storage space by changing the encoding bit-rate. An acceptable quality bit-rate is around 128Kbit per second. There are eight bits in a byte, so that makes 16KB per second, giving our demo a maximum of about four seconds of music.
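The arithmetic so far can be checked with a few lines of Python. This is only a sketch of the sums above; the names are made up for illustration, and it assumes 64KB means 65,536 bytes and 128Kbit means 128,000 bits.

```python
DEMO_BYTES = 64 * 1024  # the whole size budget for the demo

# Uncompressed CD-style audio: 44,100 samples/s, 2 bytes per sample, 2 channels.
pcm_bytes_per_second = 44_100 * 2 * 2

# MP3 at 128 kbit/s (assuming "K" means 1000 bits here).
mp3_bytes_per_second = 128_000 // 8


def seconds_that_fit(bytes_per_second, budget=DEMO_BYTES):
    """How many seconds of audio fit in the budget at a given data rate."""
    return budget / bytes_per_second


print(pcm_bytes_per_second)                                # 176400
print(pcm_bytes_per_second * 5 * 60)                       # 52920000 bytes for five minutes
print(round(seconds_that_fit(pcm_bytes_per_second), 2))    # 0.37
print(round(seconds_that_fit(mp3_bytes_per_second), 2))    # 4.1
```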

Compression isn't working for us, time for a re-think.

We need to start thinking about a more procedural approach to the problem. We don't actually need to record all of the music, we just need to know what notes to play and when, like a piano-roll zooming through a Pianola. If we can invent a small piece of code that makes a sound like a piano when we tell it to "play note X at time T", then our music can be much smaller, just a series of time-stamps and notes to play.
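As a rough illustration of the "play note X at time T" idea, here is a minimal sketch. The event layout and the `play_note` callback are hypothetical stand-ins, not the actual player; a real synthesizer would replace the callback.

```python
from typing import Callable, Iterable, List, Tuple

Event = Tuple[int, int]  # (time in ticks, MIDI-style note number)


def run_piano_roll(events: Iterable[Event],
                   play_note: Callable[[int, int], None]) -> None:
    """Step through the roll in time order and trigger each note."""
    for time, note in sorted(events):
        play_note(note, time)


# Stand-in for a synthesizer: just record what it was asked to play.
played: List[Tuple[int, int]] = []
roll = [(0, 60), (48, 64), (24, 67)]
run_piano_roll(roll, lambda note, time: played.append((note, time)))
print(played)  # [(60, 0), (67, 24), (64, 48)]
```

The music itself is now just the `roll` list, a couple of small numbers per note rather than thousands of samples.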

There are two common ways of doing this: MOD and MIDI. MOD was invented almost 20 years ago to suit the particular needs of the Amiga computer's sound chip. It is a very popular format for demo authors and its utility cannot be overstated. However, because the format conforms to the needs of the Amiga, we find ourselves today fighting against its constraints. For example, volume in a MOD file runs from 0 to 64, so the quietest non-silent level is 1/64 of full volume, or about -36dB, which doesn't give us very much in the way of dynamic range. It's fine for electronic music, and indeed many demo soundtracks are amazing works of electronica, but it's quite poor for playing classical music on a piano.
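The -36dB figure follows from the usual amplitude-to-decibel formula. A small sketch, assuming the MOD volume scale runs linearly from 0 to 64 (the function name is invented for illustration):

```python
import math

MOD_MAX_VOLUME = 64  # assumption: MOD volume values run from 0 to 64


def volume_to_db(volume, full_scale=MOD_MAX_VOLUME):
    """Level of a linear volume step relative to full volume, in decibels."""
    if volume <= 0:
        return float("-inf")  # silence
    return 20 * math.log10(volume / full_scale)


print(round(volume_to_db(1), 1))   # -36.1
print(round(volume_to_db(64), 1))  # 0.0
```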

Another constraint is that you have to write the music in "patterns", almost like programming the music using subroutines. Again, this is fine for electronic music which tends to be structured on repetitive themes allowing a small number of patterns to be reused throughout the song, but it's much less useful for a classical piece which is constantly evolving.

So let's have a look at MIDI instead.

When you think of MIDI you're probably thinking of the cheesy music files you get all over the internet and in mobile phone ring tones. But this is not exactly the kind of MIDI I'm talking about. Those files are more accurately called General MIDI, and they sound naff because the General MIDI system defines a certain set of instruments which are not normally well reproduced on PC soundcards.

What we're more interested in is the "old school" MIDI, which is nothing more than a simple stream of notes and other commands telling a musical instrument what to do. General MIDI at its heart is the same thing, but by stripping away the extra General MIDI stuff that we aren't interested in, we arrive at a more compact representation of the music. But is it compact enough?

Let's see with an example. I chose the Moonlight Sonata by Beethoven, the full version in three parts, which is a little over 15 minutes long. Downloaded off the internet, the General MIDI version weighs in at 67KB. ZIPping it gives us a reasonable idea of how it will compress in the final demo: Moonlight Sonata ZIPs down to 20KB. I can also strip away the commands I'm not interested in to see how that affects the file size, and the result is 65KB uncompressed, 18KB compressed.

That's really encouraging: finally we've found something that fits into 64KB. But we can do better. You see, the problem with MIDI is that it's one single sequence of commands. The note-on commands get jumbled together with the note-offs, tempo changes, sustain pedal on/offs, and so on. Compression algorithms such as ZIP don't enjoy this kind of heterogeneous melange; they prefer instead to have long homogeneous sequences of the same information repeating over and over.

For example, if you have a song with a drum track which repeats over and over at exactly the same tempo for the entire song, ZIP compression can pack it down into a small initial pattern, and then just say "...and repeat that 50 times". But if the other instruments are all mixed together into the same stream, this is impossible.

So what we can do is to break up our single MIDI sequence into a group of sequences, one for the note-on commands, one for the note-offs, and so on. This makes the data much more predictable, and thus easier to compress. It also makes the data smaller because we no longer need to name the command every time. MIDI itself has a similar trick called "running status", which leaves out the command byte when it is the same as the previous one, and by splitting the commands out into separate streams we've effectively created the ultimate in running status.
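Here is a minimal sketch of the splitting idea, not the real converter. It assumes a simplified event of one delta-time byte, one command byte and one data byte (no velocities or controller numbers), and the test data from the hypothetical `make_events` is a made-up repeating figure, so the compressed sizes it prints say nothing about real music.

```python
import zlib
from collections import defaultdict

NOTE_ON, NOTE_OFF, PEDAL = 0x90, 0x80, 0xB0  # MIDI status bytes, channel 1


def make_events():
    """Hypothetical test data: a broken chord with one pedal change per bar."""
    events = []
    chord = [60, 64, 67, 72]
    for bar in range(64):
        start = bar * 96
        events.append((start, PEDAL, 127))      # pedal down
        for i, note in enumerate(chord):
            t = start + i * 24
            events.append((t, NOTE_ON, note))
            events.append((t + 20, NOTE_OFF, note))
        events.append((start + 90, PEDAL, 0))   # pedal up
    return sorted(events)


def encode_interleaved(events):
    """One stream: delta time, command byte and data byte for every event."""
    out, last = bytearray(), 0
    for t, kind, value in events:
        out += bytes([t - last, kind, value])
        last = t
    return bytes(out)


def encode_split(events):
    """One stream per command: delta time and data byte only."""
    streams = defaultdict(bytearray)
    last = defaultdict(int)
    for t, kind, value in events:
        streams[kind] += bytes([t - last[kind], value])
        last[kind] = t
    return {kind: bytes(data) for kind, data in streams.items()}


def decode_split(streams):
    """Merge the per-command streams back into one time-ordered event list."""
    events = []
    for kind, data in streams.items():
        t = 0
        for i in range(0, len(data), 2):
            t += data[i]
            events.append((t, kind, data[i + 1]))
    return sorted(events)


events = make_events()
interleaved = encode_interleaved(events)
streams = encode_split(events)
split = b"".join(streams.values())

assert decode_split(streams) == events  # nothing is lost by splitting

print(len(interleaved), len(split))  # 1920 1280
print(len(zlib.compress(interleaved, 9)), len(zlib.compress(split, 9)))
```

The raw saving is exactly one command byte per event. How much the compressed size improves depends entirely on the music, which is why the second line is printed without an expected value.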

Taking our Moonlight Sonata MIDI from earlier and doing this, we end up with a file that is somewhat smaller at 54KB, and it ZIPs down to 16KB.

That's a really good result for over fifteen minutes of non-repetitive music!

Additionally, there is still plenty of room for experimentation to try and get the file size down even further. Depending on the music we eventually choose, we might have a very rhythmic set of bass notes, which could benefit from being broken out into separate tracks. And the same goes for arpeggiated melodies. The goal is always to isolate repetitive patterns into separate streams to give the compression algorithm a better chance.

Another experiment is to have the note numbers stored as delta-values instead of absolute. What does this mean? Well, if for example we have a middle C major chord (C, E, G) then the MIDI note numbers are 60, 64, 67. Rather than storing these absolute values we can instead store the first note followed by the differences between successive note numbers: 60, 4, 3. This is known as delta encoding. If the notes are frequently evenly spaced (e.g. lots of chords, arpeggios) then our delta-encoded notes form into very regular patterns, and are thus easier to compress.
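A minimal sketch of delta encoding for note numbers. It keeps the first note as an absolute value so the sequence can be decoded again; the function names are made up for illustration.

```python
def delta_encode(notes):
    """Store the first note as-is, then each note as the step from the previous one."""
    previous = 0
    deltas = []
    for note in notes:
        deltas.append(note - previous)
        previous = note
    return deltas


def delta_decode(deltas):
    """Undo delta_encode by keeping a running total."""
    notes, current = [], 0
    for step in deltas:
        current += step
        notes.append(current)
    return notes


c_major = [60, 64, 67]
f_major = [65, 69, 72]
print(delta_encode(c_major))  # [60, 4, 3]
print(delta_encode(f_major))  # [65, 4, 3]
print(delta_decode(delta_encode(c_major)))  # [60, 64, 67]
```

Two different major chords now share the same "4, 3" tail, which is the kind of repetition a compressor can pick up on.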

Each command comes in two parts: the time when it should occur, followed by information about the command. Another experiment we can try is to split these two parts into separate streams, one holding only the times and one holding only the command data. This will give us two homogeneous streams rather than one heterogeneous one, which ought to compress better.
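The time/data split can be sketched in a few lines. This assumes each event is just a (delta time, data byte) pair, and the test data is invented, so the compressed sizes are printed without any claim about which layout wins.

```python
import zlib


def split_time_and_data(events):
    """Turn [(delta_time, data), ...] into a times stream and a data stream."""
    times = bytes(t for t, _ in events)
    data = bytes(d for _, d in events)
    return times, data


def join_time_and_data(times, data):
    """Rebuild the original event list from the two streams."""
    return list(zip(times, data))


# Hypothetical data: a constant delta time with varying note numbers.
melody = [60, 62, 64, 65, 67, 69, 71, 72, 71, 69, 67, 65, 64, 62]
events = [(24, melody[(i * i) % len(melody)]) for i in range(512)]

times, data = split_time_and_data(events)
assert join_time_and_data(times, data) == events  # the split is lossless

interleaved = bytes(b for event in events for b in event)
print(len(interleaved), len(times) + len(data))  # 1024 1024
print(len(zlib.compress(interleaved, 9)), len(zlib.compress(times + data, 9)))
```

Unlike splitting by command, this changes nothing about the raw size; any gain has to come from the compressor finding the two streams easier to model.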

None of these experiments are worth trying right now because the results will depend heavily upon the music we finally choose for the demo. So we will put them aside for the time being and remember them for later.

The end result isn't recognisably MIDI any more and the software we must write to handle all of these streams playing together will be larger and more complex than a simple MIDI sequencer, but we can address those issues later. The important thing is that we've found a way to make music fit into our 64KB demo without compromising on quality.

In the next article I'll start talking about one of the other key problems in writing this piano synthesizer: How do you make a convincing noise like a piano in such a small amount of space?

4 comments:

JC Barnett said...

Interesting stuff! What is your predicted timeline for this project? Dying to see/hear some results.

Alistair said...

I've currently got a decent-sounding piano which plays music in realtime. However it's still too big and too slow, it only-just plays in realtime. So I'm in the optimization phase of the project now, and mucking around with the tone of the piano to try and make it sound better.

I've got a recording of it playing the first part of Moonlight, I just need to find a way of hosting that sound file somewhere slightly more anonymous than on my own website :)

JC Barnett said...

When did who decide "Moonlight Sonata" was to be the "Quick brown fox" of anything to do with digital music? I missed that meeting. So many more interesting pieces to choose from...grumble.

Alistair said...

No idea to be honest. I only chose it because the pace and style fits the mood of the demo, so compression experiments are likely to yield more meaningful results.

I think eventually I'll go with one of Mozart's piano sonatas, or perhaps the Maple Leaf Rag just for a laugh :-)