A 9-bit pitch technique
This article describes a new approach to representing pitch in C64 playroutines. The intended audience is people with a general understanding of C64 coding and SID playroutine design, although readers outside this somewhat narrow demographic may still find the article interesting.
Background
The SID chip in the C64 has, for each of the three voices, a 16-bit register that controls the oscillator frequency. This 16-bit value gets added into a 24-bit accumulator on every clock cycle, and the top bits of the accumulator are used as an index into the current waveform.
Traditionally, for speed and simplicity, playroutines work with integer notes, for instance in the range 0-95. A table is then used to translate each note into its corresponding 16-bit frequency value.
Such a lookup table can be prepared as follows, here with A440 at note number 60:
    #include <math.h>

    #define SID_CLOCK_HZ 985248

    int freq[96];

    void compute_freq() {
        int i;
        double semitone = pow(2, 1.0/12);

        for(i = 0; i < 96; i++) {
            freq[i] = round(440 * pow(semitone, i - 60) / SID_CLOCK_HZ * 0x1000000);
        }
    }
After the lookup, effects such as vibrato and glide are computed by adding a 16-bit offset to the frequency value. This approach is problematic, because the glide or vibrato will then affect the perceived pitch differently in different parts of the frequency range. Composers have to compensate for this, e.g. by selecting a greater vibrato depth for high-pitched notes.
In particular, suppose you have a sound patch (or instrument) that involves a rapid arpeggio between two notes that are an octave apart. You are currently playing a sustained sound that arpeggiates between C-4 and C-5, and you wish to glide an octave down, so as to end up with a sound that arpeggiates between C-3 and C-4. This will not work using the above approach, because the same glide offset would have to be added to (or subtracted from, as it were) the frequency of each note. The frequency of C-5 is twice that of C-4, so once you reach the point where the upper note has become a C-4, the lower note will be at 0 Hz.
To get rid of these problems, we could instead perform all computations on linear pitch values, perhaps using 8.8 fixed-point math. In this model, effects are applied to the 16-bit pitch value, and in a final step the pitch is converted to the corresponding frequency by looking up—in the same table as above—the values corresponding to the closest note below and above the desired pitch, and then performing a linear interpolation between the two frequencies based on the fractional part of the pitch. This creates a piece-wise linear approximation of an exponential curve, which coincides with the ideal curve at the semitones.
However, since the CPU used in the C64 lacks a multiplication instruction, the interpolation step described above is costly in terms of clock cycles.
This article describes a novel way of working with pitch values that is highly efficient in practice.
Pitch as a 9-bit quantity
Let me start with a somewhat controversial proposition:
Two bits of fractional pitch are sufficient.
That is to say, we can represent the usual range of 96 semitones with a fixed-point number in 7.2-bit form or, equivalently, a 9-bit integer expressing the pitch in microtones, where four microtones correspond to one semitone.
The claim is controversial, because if we listen closely to a slow glide over a large interval, let's say an octave, the quantisation is audible. But in my experience, enough precision is retained for glide and vibrato effects to sound good. You can evaluate this yourself, because the technique is used in Scene Spirit. Hopefully you'll agree that the glides and vibratos sound quite normal.
The next step is to generate a large table that supplies a frequency value for each of the 96 * 4 = 384 microtones. Values from this table are then copied verbatim into the SID registers.
The code for precomputing the table might look like this:
    #include <math.h>

    #define SID_CLOCK_HZ 985248

    int freq[96 * 4];

    void compute_freq() {
        int i;
        double microtone = pow(2, 1.0/48);

        for(i = 0; i < 96 * 4; i++) {
            freq[i] = round(440 * pow(microtone, i - 60 * 4) / SID_CLOCK_HZ * 0x1000000);
        }
    }
But wait a minute. How can 9-bit math be efficient on an 8-bit processor?
Here is another proposition:
Pitch may be conveniently represented as a sum.
This is less controversial. For instance, a playroutine might compute pitch as the sum of the current note, track transpose, instrument transpose, detune, arpeggio, vibrato and glide. We can group these terms into two 8-bit quantities, which are added together at the very last step to produce a 9-bit sum.
Each subsum is of course limited to an 8-bit range (64 semitones). The question is then, how can we expose this constraint in a natural way to the composer?
A table of pitch offsets
In the playroutine and tracker that I developed for Scene Spirit, there is a notion of a current instrument and a current arpeggio for each voice. The instrument is responsible for maintaining the current base pitch, which is one of these 8-bit quantities. For leads, it is computed relative to the note being played, and for e.g. some drums it is rapidly decremented from a high to a low value, producing a glide effect. Independently, the current arpeggio provides the offset pitch, which is the other 8-bit quantity. The offset pitch is looked up in a global offset table based on the current arpeggio position for the voice.
The offset table is quite versatile: The default arpeggio consists of a fixed value of $70. This effectively selects a window of pitches (semitones 28-91) that are available for use by tracks and instruments. Whenever the bass instrument plays, an arpeggio with a fixed value of $10 is selected instead, moving the window two octaves down, effectively reconfiguring the voice to use another set of pitches (semitones 4-67).
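The window arithmetic can be checked with a few lines of C. This is only a model of the sum, and the function names are invented for the example.

```c
#include <stdint.h>

/* Combine the two 8-bit terms into a 9-bit pitch in microtones (0-510). */
uint16_t combine_pitch(uint8_t base_pitch, uint8_t offset_pitch)
{
    return (uint16_t)(base_pitch + offset_pitch);
}

/* Lowest and highest semitone reachable with a fixed offset pitch,
 * as the base pitch sweeps its full 8-bit range. */
int window_low(uint8_t offset_pitch)
{
    return combine_pitch(0x00, offset_pitch) / 4;
}

int window_high(uint8_t offset_pitch)
{
    return combine_pitch(0xff, offset_pitch) / 4;
}
```

For an offset of $70 this gives semitones 28 and 91, and for $10 it gives 4 and 67, matching the two windows mentioned above.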
To create a major chord, we would place the following sequence of offsets in the table: $70, $5c, $50, $40. The playroutine would then move the current arpeggio position through each of these table entries in turn.
And since we are working with microtones, a vibrato can be regarded as a special kind of arpeggio. The offset sequence would then be something like $70, $71, $72, $72, $71, $70, $6f, $6e, $6e, $6f.
Finally, the playroutine supports one-shot arpeggios (meaning that they stop at the final value, rather than repeat), so we can also use the offset table to create short glides up to a target note. For instance, to approach a note from one semitone below, we'd just select a one-shot arpeggio with the values $6c, $6d, $6e, $6f, $70.
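As a rough model of looping and one-shot arpeggios, consider the following C sketch. The actual data format used in Scene Spirit is not described here, so the struct, its fields and the function names are purely illustrative, and one table step per call is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical arpeggio descriptor; not the playroutine's real format. */
struct arpeggio {
    const uint8_t *offsets;   /* offset pitches in microtones */
    size_t length;            /* number of entries, at least 1 */
    int one_shot;             /* nonzero: stop at the final value */
};

/* Two of the sequences from the text. */
static const uint8_t major_chord[] = { 0x70, 0x5c, 0x50, 0x40 };
static const uint8_t glide_up[]    = { 0x6c, 0x6d, 0x6e, 0x6f, 0x70 };

/* Offset pitch in effect a given number of steps after the arpeggio started. */
uint8_t arp_offset(const struct arpeggio *arp, size_t step)
{
    if (step >= arp->length)
        step = arp->one_shot ? arp->length - 1 : step % arp->length;
    return arp->offsets[step];
}

/* Distance from the default offset $70, in semitones (4 microtones each). */
double semitones_from_default(uint8_t offset)
{
    return ((int)offset - 0x70) / 4.0;
}
```

Decoded this way, the chord entries $70, $5c, $50, $40 sit 0, 5, 8 and 12 semitones below the default, which spells out a major chord from the top down.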
Converting from pitch to frequency
The summing of the two 8-bit quantities is performed very efficiently as part of the final table lookup, shown here for voice 1:
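    voice1_arppos = * + 1
            ldx     arptable        ; X = offset pitch at the current arpeggio position
    voice1_base1 = * + 1
            lda     freq_lsb,x      ; operand low byte holds the base pitch
            sta     $d400
    voice1_base2 = * + 1
            lda     freq_msb,x      ; second copy of the base pitch
            sta     $d401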
The base pitch is kept in the least significant byte of the instruction operand of each lda instruction, so it must be written twice whenever it changes. The current arpeggio position is kept in the least significant byte of the ldx operand. All three tables must be page-aligned.
Compared to the interpolation-based technique outlined above, this way of computing a frequency value from a linear pitch is extremely fast, with a worst case execution time of 22 clock cycles. It is used in Scene Spirit, along with many other tricks, to achieve a total worst case time of 14 rasterlines for the entire playroutine.
Of course, the microtone frequency table is quite big. With 384 entries of 16 bits each, it occupies a full three pages (with an inconvenient half-page gap in the middle). This should be compared to the traditional 96-entry table, which occupies 3/4 of a page. Is there any way around this?
Folding the table
As a first step, consider what happens if we skip every other entry in the frequency table. We also split the conversion procedure into two cases, based on the least significant bit of the 9-bit pitch value. When the least significant bit is zero, we perform a regular lookup based on the remaining eight bits. If it is one, we compute the average of the two table entries surrounding our desired pitch. This is effectively linear interpolation with a single-bit parameter.
Here is a straightforward way of doing it:
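            lda     base_pitch
            clc
            adc     offset_pitch    ; carry = bit 8 of the 9-bit sum
            ror                     ; A = bits 8..1, carry = bit 0
            tax
            bcc     fractional_0
    fractional_1
            lda     freq_lsb,x      ; add the two surrounding entries
            clc
            adc     freq_lsb+1,x
            sta     temp
            lda     freq_msb,x
            adc     freq_msb+1,x
            ror                     ; halve the sum, msb first
            sta     $d401
            lda     temp
            ror                     ; then lsb
            sta     $d400
            jmp     done
    fractional_0
            lda     freq_lsb,x
            sta     $d400
            lda     freq_msb,x
            sta     $d401
    done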
But we can do better. Notice how we are computing the average as:
(freq[i] + freq[i + 1]) / 2
Analytically, this is equivalent to:
freq[i] / 2 + freq[i + 1] / 2
Sure, due to rounding we lose one bit of precision, but this is negligible compared to the error we are introducing by making a piece-wise linear approximation in the first place.
Next, observe that the halved values we want are actually stored in the table, exactly one octave (24 entries) earlier!
freq[i - 24] + freq[i + 1 - 24]
In practice, we have to extend the table with an extra octave towards the bottom, and adjust our lookups accordingly. Then we get:
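            lda     base_pitch
            clc
            adc     offset_pitch    ; carry = bit 8 of the 9-bit sum
            ror                     ; A = bits 8..1, carry = bit 0
            tax
            bcc     fractional_0
    fractional_1
            lda     freq_lsb,x      ; halved entries, one octave down
            clc
            adc     freq_lsb+1,x
            sta     $d400
            lda     freq_msb,x
            adc     freq_msb+1,x
            sta     $d401
            jmp     done
    fractional_0
            lda     freq_lsb+24,x   ; skip the extra octave at the bottom
            sta     $d400
            lda     freq_msb+24,x
            sta     $d401
    done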
Finally, we can extend this trick to consider both of the fractional bits. The linear interpolation is a weighted sum according to the following table:
| Fractional bits | Weight of lower semitone | Weight of upper semitone |
|---|---|---|
| 00 | 100% | 0% |
| 01 | 75% | 25% |
| 10 | 50% | 50% |
| 11 | 25% | 75% |
We have seen that we can scale a frequency by 50% by going back one octave. Clearly, we can scale by 25% by going back two octaves. But what about 75%?
From music theory we know that a perfect fifth corresponds to a frequency ratio of 3:2 = 1.5. In equal temperament, this is then approximated by seven semitones, with an actual ratio of 1.4983. But this means that if we go seven semitones up, and then one octave down (for a net movement of five entries down the table), we should obtain a frequency that is very close to 3:4, or 75%, of the original entry.
The code grows a bit, because we now have four cases. The table also needs to be extended with a further octave to accommodate the 25% lookup. But essentially, we're back to the traditional frequency table with one entry per semitone. Here's the final routine:
            lda     base_pitch
            clc
            adc     offset_pitch
            ror                     ; carry = fractional bit 0
            bcc     fractional_x0
    fractional_x1
            lsr                     ; A = semitone, carry = fractional bit 1
            tax
            bcc     fractional_01
    fractional_11                   ; 25% of lower + 75% of upper semitone
            lda     freq_lsb,x
            clc
            adc     freq_lsb+19+1,x
            sta     $d400
            lda     freq_msb,x
            adc     freq_msb+19+1,x
            sta     $d401
            jmp     done
    fractional_01                   ; 75% of lower + 25% of upper semitone
            lda     freq_lsb+19,x
            ;clc                    ; carry is already clear here
            adc     freq_lsb+1,x
            sta     $d400
            lda     freq_msb+19,x
            adc     freq_msb+1,x
            sta     $d401
            jmp     done
    fractional_x0
            lsr                     ; A = semitone, carry = fractional bit 1
            tax
            bcc     fractional_00
    fractional_10                   ; 50% of lower + 50% of upper semitone
            lda     freq_lsb+12,x
            clc
            adc     freq_lsb+12+1,x
            sta     $d400
            lda     freq_msb+12,x
            adc     freq_msb+12+1,x
            sta     $d401
            jmp     done
    fractional_00                   ; exact semitone
            lda     freq_lsb+24,x
            sta     $d400
            lda     freq_msb+24,x
            sta     $d401
    done
And here's the code for generating the table. Note that we have to include an extra entry at the end, to support interpolation at the top of the range. With a total of 121 entries, this table occupies slightly less than one page of RAM.
    #include <math.h>

    #define SID_CLOCK_HZ 985248

    int freq[24 + 96 + 1];    /* 121 entries, including the extra one at the end */

    void compute_freq() {
        int i;
        double semitone = pow(2, 1.0/12);

        for(i = 0; i <= 24 + 96; i++) {
            freq[i] = round(440 * pow(semitone, i - 24 - 60) / SID_CLOCK_HZ * 0x1000000);
        }
    }
The worst case execution time for the above routine, if we pretend that base_pitch and offset_pitch are encoded in immediate operands like in the original code snippet, is 46 cycles.
Given that we have three voices, the net change in worst case execution time is (46 - 22) * 3 = 72 cycles, a little over one rasterline. But that's a fair price for replacing a three-page table with one that fits in a single page.
Example
This C64 program (source) performs a glide across the full 96-semitone range using the code above, at a rate of one microtone per video frame. Note how the perceived glide rate remains constant.
Concluding remarks
In this article, we have seen that it is feasible to work with 9-bit linear pitch values in a C64 playroutine, by representing them as two 8-bit values that are added together in a final step, right before the conversion to a 16-bit frequency.
We have seen how the final conversion could be made very quick with the help of a large table, and how this table can be reduced to a more typical size at the cost of one additional rasterline of execution time. By considering both of these implementations, the playroutine coder is free to decide whether maximum speed or minimum size is preferable for a given application. Crucially, this decision may be postponed until after a piece of music has been composed, when the total rastertime and song size are known.
One could even write a playroutine in which the code for the final conversion step may be dynamically loaded and switched while a song is playing. This would allow the same song to accompany both memory-hungry and cycle-hungry parts in the same trackmo, even if the exact time of each transition is decided at runtime.
Posted Monday 30-Mar-2015 22:44