In section 9-14 of Understanding the Apple ][, Jim Sather says, “any even address could be used to load data from the data register to the MPU, although $C088 ... would be inappropriate.” It might be considered inappropriate because of the one-second window noted previously, but that’s exactly how the program Mr. Do! uses it. By reading from $C088, the program is able to issue the motor off instruction, and fetch the data at the same time. It is compact and useful for anti-debugging.
Faster pussycat
Another kind of race condition revolves around how quickly the data can be read from the disk. Borrowed Time, for example, reads an entire track in one revolution. In an interview for the Open Apple podcast, Rebecca Heineman says that she performs the decoding while the seek is in progress. While this is certainly possible, it would incur the significant overhead of having to store all 16 of the two-bit arrays (a total of 1.3kB!) before any decoding could occur. Of course, this is not what was done. Instead, each sector is read individually, but the denibbilisation is interleaved with the read. It means that the sector is decoded directly into memory, with only 86 bytes of overhead for a single two-bit array, and the use of two tables of 106 bytes and 256 bytes respectively. It is obviously fast enough to catch the next sector that arrives.
The code looks like this, after validating the data field prologue:
0946 LDY #$AA
;zero rolling checksum
0948 LDA #0
094A STA $26
;wait for nibble to arrive
094C LDX $C0EC
094F BPL $94C
;index into table of offsets
;of structures
0951 LDA $A00,X
;store offset
0954 STA $200,Y
;update rolling checksum
0957 EOR $26
;fetch 86 times
0959 INY
095A BNE $94A
095C LDY #$AA
095E BNE $963
;store decoded value
0960 STA $9F55,Y
;wait for nibble to arrive
0963 LDX $C0EC
0966 BPL $963
;update rolling checksum
0968 EOR $A00,X
;fetch structure offset,
;bits 0-1
096B LDX $200,Y
;merge first member of two-bit
;structure with six-bit value
;to recover eight-bit value
096E EOR $B00,X
;loop 86 times
0971 INY
0972 BNE $960
;save 85th value for last
0974 PHA
;clear low two bits
0975 AND #$FC
0977 LDY #$AA
;wait for nibble to arrive
0979 LDX $C0EC
097C BPL $979
;update rolling checksum
097E EOR $A00,X
;fetch structure offset,
;bits 2-3
0981 LDX $200,Y
;merge second member of
;two-bit structure with
;six-bit value to recover
;eight-bit value
0984 EOR $B01,X
;store decoded value
0987 STA $9FAC,Y
;loop 86 times
098A INY
098B BNE $979
;wait for nibble to arrive
098D LDX $C0EC
0990 BPL $98D
;clear low two bits
0992 AND #$FC
0994 LDY #$AC
;update rolling checksum
0996 EOR $A00,X
;fetch structure offset,
;bits 4-5
;offset -2 to account for Y+2
0999 LDX $1FE,Y
;merge third member of two-bit
;structure with six-bit value
;to recover eight-bit value
099C EOR $B02,X
;store decoded value
099F STA $A000,Y
;wait for nibble to arrive
09A2 LDX $C0EC
09A5 BPL $9A2
;loop 84 times
09A7 INY
09A8 BNE $996
;clear low two bits
09AA AND #$FC
;update rolling checksum
09AC EOR $A00,X
;restore slot to X
09AF LDX $2B
;retry if checksum mismatch
09B1 TAY
09B2 BNE $9BD
;wait for nibble to arrive
09B4 LDA $C0EC
09B7 BPL $9B4
;check only 1st epilogue byte
09B9 CMP #$DE
09BB BEQ $9BF
09BD SEC
09BE .BYTE $24
09BF CLC
;store 85th decoded value
09C0 PLA
09C1 LDY #$55
09C3 STA ($44),Y
09C5 RTS
The exact way in which the technique works is as follows. First, each of the two-bit values is read into memory, but instead of storing them directly, the values are used as an index into the 106-byte table. The 106-byte table serves two purposes. The first, in the context of the two-bit values, is as an array of offsets within the 256-byte table. The second, in the context of the six-bit values, is as an array of pre-shifted values for the six-bit nibbles. The 256-byte table is composed of groups of two-bit values in all possible combinations for each of the three positions in a nibble. To produce the eight-bit value, each of the pre-shifted six-bit values is ORed with the corresponding two-bit value. It is unknown why the 85th value is treated separately from the rest in that code; it could certainly be decoded at the same time, saving five lines.
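The table trick described above can be modelled in a few lines of Python. This is only a sketch under stated assumptions, not a transcription of the loader: the names (`PRESHIFT`, `TWOBIT`, `encode_sector`, `decode_sector`) are invented for illustration, the two-bit array is assumed to be stored in ascending order with each pair bit-swapped, and the 256-byte table is assumed to hold 64 four-byte structures. The nibble list is the standard 6-and-2 translate table.

```python
# Standard 6-and-2 disk nibbles, in order of the six-bit value they encode.
NIBBLES = bytes.fromhex(
    "96979a9b9d9e9f" "a6a7abacadaeaf" "b2b3b4b5b6b7b9babbbcbdbebf"
    "cbcdcecf" "d3d6d7d9dadbdcdddedf" "e5e6e7e9eaebecedeeef"
    "f2f3f4f5f6f7f9fafbfcfdfeff"
)

def swap2(pair):
    """Swap the two bits of a two-bit value (assumed on-disk bit order)."""
    return ((pair & 1) << 1) | (pair >> 1)

# 106-byte table indexed by (nibble - 0x96): the six-bit value, pre-shifted.
# The same number doubles as the offset of a four-byte structure in TWOBIT.
PRESHIFT = [0] * 106
for six, nib in enumerate(NIBBLES):
    PRESHIFT[nib - 0x96] = six << 2

# 256-byte table: for each packed value, members 0..2 hold the two-bit value
# for the first, second and third position; the fourth byte is unused.
TWOBIT = [0] * 256
for packed in range(64):
    for member in range(3):
        TWOBIT[4 * packed + member] = swap2((packed >> (2 * member)) & 3)

def encode_sector(data):
    """Hypothetical encoder: 86 two-bit values, 256 six-bit values, checksum."""
    assert len(data) == 256
    padded = bytes(data) + b"\0\0"
    aux = [swap2(padded[i] & 3)
           | (swap2(padded[i + 86] & 3) << 2)
           | (swap2(padded[i + 172] & 3) << 4) for i in range(86)]
    out, prev = [], 0
    for six in aux + [b >> 2 for b in data]:
        out.append(NIBBLES[six ^ prev])   # rolling XOR, as on disk
        prev = six
    out.append(NIBBLES[prev])             # checksum nibble
    return out

def decode_sector(nibbles):
    """Decode while 'reading': only the 86 offsets are buffered."""
    stream = iter(nibbles)
    offsets, acc = [0] * 86, 0
    for i in range(86):
        acc ^= PRESHIFT[next(stream) - 0x96]
        offsets[i] = acc                  # offset of a structure in TWOBIT
    out = bytearray(256)
    for n in range(256):
        acc ^= PRESHIFT[next(stream) - 0x96]
        member, i = divmod(n, 86)
        out[n] = acc | TWOBIT[offsets[i] + member]
    acc ^= PRESHIFT[next(stream) - 0x96]
    return bytes(out), acc == 0           # checksum must cancel to zero
```

In this model each decoded byte is complete as soon as its six-bit nibble arrives, and the only per-sector buffer is the 86-entry `offsets` list. Every fourth byte of `TWOBIT` stays unused because the pre-shifted value is used directly as the offset.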
With the benefit of determination to improve it, and the ability to do so, I rewrote this loader to decode all of the bytes directly, reduced the size of the code, and made it even faster. I call it “0boot.”49 Then I reduced the overhead to just two bytes, if page $BF is not the destination. I call that one “qboot.”50 The two tables are still 106 bytes and 256 bytes respectively. It might appear that the second table can be reduced to 192 bytes, since the other 64 bytes are unused. However, that is not possible for this algorithm, because the alignment is required to supply the pre-shifted values. If the table were reduced in size, then additional operations would be required to reproduce the effect of the shift, and these would take longer to execute than the time available before the next nibble arrives.
Interestingly, Heineman claims to have created and released the technique in 1980,51 but it was apparently not until 1984 that she used it in a release herself. It certainly existed in 1980, though. Automated Simulations (which later became Epyx) included the technique with the programs Hellfire Warrior and Rescue At Rigel. In 1983, Free Fall Associates52 included the technique with the programs Murder on the Zinderneuf and Archon. (Apparently they took it with them, as Epyx did not use it again.) Also in 1983, Apple included the technique in ProDOS. In 1985, Brøderbund included the technique with the program Captain Goodnight. According to Roland Gustafsson, Apple supplied that code.53
Also interestingly, whoever included it in the Free Fall Associates programs either did not understand it, or just did not want to touch it—there, the loader has been patched to require page-aligned reads, but the code still performs the initialisation for arbitrary addressing. Twelve lines of code could have been removed from that version. The Interplay programs that use the technique also require page-aligned reads, but do not have the unnecessary initialisation code.
As Olivier Guinart notes, “It’s ironic that the race condition would be used by a program called Borrowed Time.”
10:7.3 Track-level protections
Track length
The length of a track might not be constant across all of the tracks on a disk. The speed of the drive is the primary reason: the faster the drive, the shorter the track. Fewer nibbles can be written because of the larger gaps between the nibbles.
Wizardry determines the length of the track by measuring the time between succeeding arrivals of sector zero, and then calculates the deviation from the expected value. This deviation value is applied to the length of several other tracks, and the result is compared against the expected lengths. If the length of the track is not within the range that is expected, then the program hangs. This protection cannot be reproduced by a sector-copier or track-copier, because they will discard the original data between the sectors, thus altering the length of the track. A bit-copier can usually reproduce this protection because it writes the entire track mostly as it appeared originally, so the track length is at least similar to the original.
Track positioning
The stepper motor in the Disk ][ is composed of four magnets. To advance a whole track requires activating and deactivating two phases in the proper order, and with a sufficient delay, for each track to step. To step to a later track, the next phase must be activated while the other phases are deactivated. To step to an earlier track, the previous phase must be activated while the other phases are deactivated. As might be expected, activating and then deactivating only one of the phases will cause the stepper to stop half-way between two tracks. This is a half-track position. It is even possible to produce quarter-track stepping reliably, by performing the half-track stepping method, but with a smaller delay. Depending on the hardware, it can also be done by activating two of the phases, and then deactivating only one of them. This last technique is used by SpiraDisc. (§10:7.3.)
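The phase sequencing can be illustrated with a small static model. It is a sketch under assumptions, not a description of the real hardware: the head position is counted in quarter-tracks, a single active phase `p` is assumed to pull the head to the nearest position congruent to `2*p` modulo 8, and two adjacent phases held active together are assumed to hold it midway between them. Timing is ignored entirely, so the delay-based methods are not captured. The names `settle` and `step_sequence` are invented for illustration.

```python
def settle(position, active):
    """Return the head position (in quarter-tracks) after the given set of
    phases has been held active long enough for the head to come to rest."""
    phases = set(active)
    if len(phases) == 1:
        (p,) = phases
        residue = (2 * p) % 8
    else:
        # Two adjacent phases: rest midway between them (assumed behaviour).
        lead = [a for a in phases if (a + 1) % 4 in phases]
        if len(phases) != 2 or len(lead) != 1:
            return position               # no phases, or no net pull
        residue = (2 * lead[0] + 1) % 8
    delta = (residue - position + 4) % 8 - 4
    if delta == -4:
        return position                   # exactly opposite: direction undefined
    return max(0, position + delta)       # track zero is the hard stop

def step_sequence(position, sequence):
    """Apply a list of phase sets in order and return the final position."""
    for active in sequence:
        position = settle(position, active)
    return position

# From track 0 (phase 0): phases 1 then 2 advance one whole track.
whole_track = step_sequence(0, [{1}, {2}])        # 4 quarter-tracks
# Stopping after phase 1 alone leaves the head on the half-track.
half_track = step_sequence(0, [{1}])              # 2 quarter-tracks
# Holding phases 1 and 2 together rests on a quarter-track in this model.
quarter_track = step_sequence(0, [{1}, {1, 2}])   # 3 quarter-tracks
```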
The issue with half-track and quarter-track positioning is that data written to these partial track positions will cause signal interference with data written to the neighbouring half-track or quarter-track at the same relative position. To avoid unintentional cross-talk, data can be written to only part of the track such that there is no overlap, or placed at least three-quarters of a track apart. (The reliability of three-quarter tracks is questionable.)
The maximum amount of data that can be placed at partial-track intervals is proportional to the stepping—a quarter of a track for each of four consecutive quarter-tracks, half of a track for each of two consecutive half-tracks, or a full track for consecutive three-quarter-tracks. There can be a significant performance hit to access the data, too—it requires an almost complete rotation to reach the start of the data on subsequent tracks if the maximum density is used, because the seek time is long enough that the start will be missed on the first time around. As a result, the most common amount that is used is only a quarter of the track, and placed far enough around the track that the read can be performed almost continuously. Programs that make use of partial tracks usually include a standard format of individual sectors, so the only trick to the protection is the location of the data on the disk.
Agent USA uses the half-track technique with five sectors per track.
Championship Lode Runner uses an alternating quarter-track technique with just two sectors per track but of twice the size. While loading, the access alternates between the neighbouring quarter-tracks, resulting in the drive chattering, but allowing the sectors to be spaced only half of a rotation apart. In both cases of the programs here, it results in an extremely fast load time because of the reduced head movement.
In this case, the protection is the use of partial tracks. Copy programs which do not copy the partial tracks (and copying partial tracks is not the default behavior) will fail to reproduce the protection.
Synchronised tracks
If the approximate rotation speed of the drive is known, then it becomes possible to place sectors at specific locations on tracks, such that they have a special position relative to sectors on other tracks. This technique is identical to synchronized sectors, except that it spans tracks, making it even more difficult to reproduce, because it is difficult to determine the relative position of sectors across tracks. Unlike “spiral tracking” (§10:7.3), this technique limits itself to checking for the existence of particular sectors, rather than actually reading them.
Blazing Paddles uses this technique. Once it finds sector zero on track zero, as a known starting point, it seeks to track one, reads the address field of the next sector to arrive, and then compares it to an expected value. If the proper sector is found, then the program seeks to track two, reads the address field of the next sector to arrive, and compares it to an expected value. If the proper sector is found, then the program seeks to track three. This is repeated over eight tracks in total. It means that the original disk has one sector placed at a specific location on each of eight consecutive tracks, relative to sector zero of track zero, such that it factors in how much the disk rotates during the time that the controller takes to move the head from track zero. It also supports slight variations in rotation speed, such that the read can begin anywhere after the address field for the previous sector, without failing the protection.
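A rough model of this kind of check is sketched below. The numbers are assumptions chosen for illustration: a nominal 200 ms revolution (300 rpm), 16 evenly spaced address fields per track, and a hypothetical 30 ms from finding one address field to being ready to read on the next track. The function names and the layouts are invented for the example and are not taken from Blazing Paddles.

```python
import math

REV_MS = 200.0             # nominal revolution time at 300 rpm
SLOTS = 16                 # address fields per track, assumed evenly spaced
SLOT_MS = REV_MS / SLOTS

def next_address_field(layout, t_ms):
    """Return (sector number, arrival time) of the next address field to
    pass under the head at or after time t_ms on a track with this layout."""
    angle = (t_ms % REV_MS) / SLOT_MS
    slot = math.ceil(angle)
    arrival = t_ms + (slot - angle) * SLOT_MS
    return layout[slot % SLOTS], arrival

def passes_check(tracks, expected, seek_ms):
    """Seek track by track, comparing the first sector seen on each track."""
    t = 0.0                # sector zero of track zero has just been found
    for layout, want in zip(tracks, expected):
        t += seek_ms
        sector, t = next_address_field(layout, t)
        if sector != want:
            return False
    return True

def rotated(first_slot):
    """A track whose sector zero sits at the given angular slot."""
    return [(s - first_slot) % SLOTS for s in range(SLOTS)]

# Hypothetical original: sector zero is placed where the head will land.
ORIGINAL = [rotated(s) for s in (3, 6, 9, 12, 15, 2, 5, 8)]
# Hypothetical copy: every track starts at the same angle.
COPY = [rotated(0) for _ in range(8)]
EXPECTED = [0] * 8
```

With these assumptions, `passes_check(ORIGINAL, EXPECTED, 30.0)` succeeds and `passes_check(COPY, EXPECTED, 30.0)` fails. Because the read may begin anywhere in the gap before the expected address field, a somewhat shorter or longer seek still lands on the same sector.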
Track spiralling
“Track spiralling” or “spiral tracking” is a technique whereby the data is placed in partial-track intervals, but treated as a complete track. By measuring the time to move the head to a partial-track, the position on the track can be known, such that the next sector to be read will have a predictable number, and therefore can be read without validation, once the start of the sector is found. A copy of the disk will not place the data at the same relative position, causing the protection to fail. The stepping in spiral tracking goes in only one direction. A visualisation of the data access would look like a broken spiral, hence the name.
One major problem with spiral tracking is that variations in rotation speed can result in the read missing its cue and not finding the expected sector. For thirty years, I believed a claim that the program Captain Goodnight uses this technique.54 It doesn’t. The Observatory uses a spiral pattern for faster loading, but still verifies the sector number first. However, the program LifeSaver uses true spiral tracking.
Track arcing
“Track arcing” uses the same principle as spiral tracking, but instead of stepping in only one direction, it reaches a threshold and then reverses direction.
Track mirroring
Track mirroring should be placed conceptually between synchronized tracks and spiral tracking. As with synchronized tracks, it expects a particular sector to be found after stepping across multiple tracks. As with spiral tracking, it reads the sector data. However, unlike spiral tracking, it verifies that the contents of that sector match exactly the contents of all of the other sectors that are synchronized similarly across the tracks.
The Toy Shop uses this technique. It reads three consecutive quarter-tracks in RWTS18 format, and verifies that they are all fully readable and have valid checksums. This is possible only because they are identical in their content and position. The contents of the last quarter-track are used to boot the program. A funny thing occurs when the program is converted to a NIB image: the protection is defeated transparently, because NIB images do not support partial tracks, so the attempt to read consecutive quarter-tracks will always return identical data, exactly as the protection requires!
Pinball Construction Set uses this technique. It reads a sector then activates a phase to advance the head, and then proceeds to read a sector while the head is moving. The head continues to drift over the track while the sector is being read. After reading the sector, the program deactivates the phase, reads another sector, and then completes the move to the next track. Once there, it reads a sector. It activates a phase to retreat the head, and then performs the same trick in reverse, until the start of the track is reached again. It performs this sequence four times across those two tracks, which makes the drive hiss. The program is able to read the sector as continuous data because the disk has consecutive quarter-tracks that are identical in their content and position.
Cross-talk
While cross-talk is normally something to be avoided, it can serve as a copy-protection mechanism, by intentionally allowing it to occur. It manifests itself in a manner similar to the effect of having excessive consecutive zero-bits being present in the stream, where reading the same stream repeatedly will yield different values. The lack of such an effect indicates the presence of a copy.
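The check amounts to reading the same region several times and looking for disagreement. The sketch below assumes a hypothetical `read_stream` callback that returns the nibbles read from the same position on each call; the two readers at the end are simulations for demonstration, not real drive access.

```python
import itertools

def shows_crosstalk(read_stream, attempts=4):
    """Return True if repeated reads of the same region ever disagree."""
    first = read_stream()
    return any(read_stream() != first for _ in range(attempts - 1))

def stable_reader():
    """Simulated copy: every read returns the same nibbles."""
    return bytes([0xFF, 0xD5, 0xAA, 0x96])

# Simulated original: reads alternate between two different results.
_unstable = itertools.cycle([bytes([0xFF, 0xD5, 0xAA, 0x96]),
                             bytes([0xFF, 0xD7, 0xAB, 0x96])])

def unstable_reader():
    return next(_unstable)

looks_like_copy = not shows_crosstalk(stable_reader)
looks_like_original = shows_crosstalk(unstable_reader)
```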
More tracks
Many disk drives had the ability to seek beyond track 34, and many disks also carried more than 35 tracks. However, since DOS could not rely on the presence of either of these things, it did not offer support for them. Some copy programs did not support the copying of additional tracks for the same reason. Of course, programmers who did not use DOS had no such limitation. While the actual number of available tracks could vary up to 40 or even 42, it was fairly safe to assume that at least one extra track existed, and could be read by direct use of the disk drive.
Faial uses this technique to place data on track 35.
SpiraDisc
No description of copy-protection techniques could be complete without including SpiraDisc. This program was a protection technology that introduced the idea of spiral tracking, though the implementation is not spiral tracking as we would describe it today. It is, in fact, a precise placement of multiple sectors on quarter-tracks, such that there is no cross-talk while reading them, but without a specific order. The major deviation from the current idea of spiral tracking is that there is no synchronization of the sectors beyond avoiding cross-talk. The program will allow a complete rotation of the disk to occur, if necessary, while searching for the required sector.
The first-stage boot loader is a single sector that is 4-and-4 encoded, 768 bytes long. The second-stage loader is composed of ten regular sectors that are 6-and-2 encoded. They are read one by one; there is no read-scattering here to speed up the process. Thereafter, reads use an alternative nibble table: all of the values in the range #$A9-FF from our first table. These values might have been chosen because they provide the least sparse array when used as indexes.
PoC or GTFO, Volume 2 Page 15