Encryption and compression
Encryption (or, more correctly, enciphering) of code was a popular technique, but the keys were always very weak. The enciphering usually consisted of an exclusive-OR of the byte with a fixed key. In some cases, the key was a rolling value taken from the byte just deciphered. In some rarer cases, multiple keys were used.
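The two most common schemes described above can be modelled in a few lines. This is a minimal sketch in Python rather than 6502 code; the function names and the seed parameter are hypothetical, and the rolling variant assumes that the key for each byte is the plaintext byte that was just deciphered.

```python
def decipher_fixed(data: bytes, key: int) -> bytes:
    """XOR every byte with the same one-byte key (also enciphers)."""
    return bytes(b ^ key for b in data)


def encipher_rolling(plain: bytes, seed: int) -> bytes:
    """Encipher so that each byte's key is the previous plaintext byte."""
    out = bytearray()
    key = seed
    for p in plain:
        out.append(p ^ key)
        key = p  # this plaintext byte becomes the key for the next byte
    return bytes(out)


def decipher_rolling(data: bytes, seed: int) -> bytes:
    """Undo encipher_rolling: the key rolls to the byte just deciphered."""
    out = bytearray()
    key = seed
    for c in data:
        p = c ^ key
        out.append(p)
        key = p
    return bytes(out)
```

With a one-byte key there are only 256 possibilities, which is why such schemes were weak: trying every key and looking for plausible opcodes is enough.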
Goonies uses a rotate operation. However, since the 6502 CPU does not have a plain rotate instruction (only rotate with carry), the program must set the carry bit correctly prior to the operation. The program does it this way:
Compression of graphics was necessary to reduce the size of the data on disk, and to decrease load times, since the reduced disk access more than made up for the time spent to decompress the graphics. The most common compression technique was Run-Length Encoding (RLE), using a stream derived from every second horizontal byte, or vertical columns. More advanced compression, such as something based on Lempel-Ziv, was generally considered to be too slow to use.
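As an illustration of why the stream was often derived from every second horizontal byte, here is a minimal RLE sketch. The (count, value) pair format and the function names are assumptions for this example, not the format of any particular title. The sample row uses the alternating $2A/$55 byte pattern of a solid hi-res colour, which plain RLE cannot shrink at all.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode as (count, value) pairs, with count limited to 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)


def rle_decode(packed: bytes) -> bytes:
    """Expand (count, value) pairs back into the original bytes."""
    out = bytearray()
    for i in range(0, len(packed), 2):
        count, value = packed[i], packed[i + 1]
        out += bytes([value]) * count
    return bytes(out)


def every_second_byte(row: bytes) -> bytes:
    """Reorder a row as all even-offset bytes, then all odd-offset bytes."""
    return row[0::2] + row[1::2]
```

For a 40-byte row of alternating $2A and $55, rle_encode on the raw row produces 80 bytes, while rle_encode on every_second_byte(row) produces 4.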
Perhaps based on the assumption that LZ-based compression was too slow, compression of code seems to have been entirely absent until recently. All of my releases use my decompressor for aPLib,[55] for an almost identical or even slightly reduced load time, which shows that the previous assumption was quite wrong. Others have had success with my decompressor for LZ4[56] when used for graphics. A more recent LZ4-based project is also showing promise.[57]
10:7.8 Virtual machines
One of the most powerful forms of obfuscation is the virtual machine. Instead of readable assembly language that we can recognise, the virtual machine code replaces instructions with bytes whose meaning might depend on the parameters that follow them. Electronic Arts were famous for their use of pseudo-code (p-code) to hide the protection routines in programs such as Archon and Last Gladiator. That virtual machine was even ported to the Commodore 64 platform.
Last Gladiator uses a top-level virtual machine that has 17 instructions. The instructions look like this:
00 JMP
01 CALL NATIVE
02 BEQ
03 LDA IMM
04 LDA ABSOLUTE
05 JSR
06 STA ABSOLUTE
07 SBC IMM
08 JMP NATIVE
09 RTS
;p-code A register
0A LDA ABSOLUTE, A
0B ASL
0C INC ABSOLUTE
0D ADC ABSOLUTE
0E XOR ABSOLUTE
0F BNE
10 SBC ABSOLUTE
11 MOVS
It has the ability to transfer control into 6502 routines, via the instructions that I named “call native” and “jmp native.” The parameters to the instructions were XORed with different values to make the disassembly even more difficult. Since the virtual machine could read arbitrary memory, it was used to access the soft-switches, in order to turn the drive on and off. Once past the first virtual machine, the program ran a second one. The second virtual machine is interesting for one particular reason. While it looks identical to the first one, it’s not exactly the same. For one thing, there are only thirteen instructions. For another, two of them have swapped places:
These two engines were not the only ones that Electronic Arts used, either. Hard Hat Mack uses a version with twelve instructions:
00 JMP
01 CALL NATIVE
02 BEQ
03 LDA IMM
04 LDA ABSOLUTE
05 JSR
06 STA ABSOLUTE
07 SBC IMM
08 JMP NATIVE
09 RTS
;p-code A register
0A LDA ABSOLUTE, A
0B ASL
Following that virtual machine was yet another variation. This one has only eleven instructions. Nine of the instructions are identical in value to the previous virtual machine. The differences are that “ASL” is missing, and the “LDA ABSOLUTE, A” instruction is now “INC ABSOLUTE.”
However, in between those two virtual machines was an entirely different virtual machine. It is a stack-based engine that uses function pointers instead of byte-code. It looks like this, if you’ll forgive handler addresses in place of any names that I wasn’t able to identify:
9DF2 .WORD xsave_retpc
9DF4 .WORD xpush_imm
9DF6 .WORD $95FF
9DF8 .WORD xpush_imm
9DFA .WORD $A600
9DFC .WORD xchkstk_vars
9DFE .WORD xbeq_rel
9E00 .WORD 4
9E02 .WORD xdo_copy_prot
9E04 .WORD xjmp_retpc
This virtual machine is Forth. Amnesia, including its copy-protection (What You Know style), was written entirely in Forth. The Toy Shop used another virtual machine, which combined byte-code and function pointers, depending on which function was called, and all mixed freely with native code. Its identity is not known.
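The function-pointer style of the Forth listing above can be sketched as a small threaded-code interpreter. This is only a model under stated assumptions: the handler names (push_imm, sub, beq_rel, halt) are hypothetical, branch offsets are counted in cells rather than bytes, and Python functions stand in for the handler addresses.

```python
def push_imm(prog, pc, stack):
    """The cell after the handler is an inline literal."""
    stack.append(prog[pc])
    return pc + 1


def sub(prog, pc, stack):
    """Replace the top two stack entries with their 16-bit difference."""
    b = stack.pop()
    a = stack.pop()
    stack.append((a - b) & 0xFFFF)
    return pc


def beq_rel(prog, pc, stack):
    """Skip forward by the inline offset if the popped value is zero."""
    offset = prog[pc]
    pc += 1
    return pc + offset if stack.pop() == 0 else pc


def halt(prog, pc, stack):
    """Returning None stops the interpreter loop."""
    return None


def run(prog):
    """Fetch a handler, call it, and continue from the pc it returns."""
    stack = []
    pc = 0
    while pc is not None:
        handler = prog[pc]
        pc = handler(prog, pc + 1, stack)
    return stack


def compare_program(x, y):
    """Push x and y, subtract, and skip a marker push when they are equal."""
    return [push_imm, x, push_imm, y, sub,
            beq_rel, 2, push_imm, 0xDEAD,
            push_imm, 1, halt]
```

Because the program is nothing but a list of handler addresses and inline data, a disassembler sees only words, which is what makes this style awkward to follow.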
Of course, the most famous of all virtual machines is the one inside Pascal, an ancestor of Delphi that was very widely used in the eighties. Wizardry is perhaps the most well-known Pascal program on the Apple ][ system, and the Pascal virtual machine made it a simple task to port the program to other platforms. The advantage of a virtual machine is that only the interpreter must be ported, rather than the entire system. Since the language is much higher-level than assembly language, it also allows for a faster development time. It also makes de-protecting a program much harder.
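To make the byte-code style concrete, here is a minimal sketch of an interpreter for a few of the Last Gladiator opcodes listed earlier. Only the opcode numbers come from that listing. The operand widths, the XOR masks (MASK8 and MASK16), the rule that BEQ tests the p-code A register, and the borrow-free SBC are all assumptions made for illustration, not a reconstruction of the Electronic Arts engine.

```python
MASK8 = 0x4C      # hypothetical mask for one-byte operands
MASK16 = 0x4C4C   # hypothetical mask for two-byte operands


def op8(opcode: int, value: int) -> bytes:
    """Assemble an instruction with a masked one-byte operand."""
    return bytes([opcode, value ^ MASK8])


def op16(opcode: int, value: int) -> bytes:
    """Assemble an instruction with a masked little-endian word operand."""
    masked = value ^ MASK16
    return bytes([opcode, masked & 0xFF, masked >> 8])


def run_pcode(code: bytes, mem: bytearray, max_steps: int = 1000) -> int:
    """Interpret code until RTS and return the p-code A register."""
    a = 0
    pc = 0
    for _ in range(max_steps):
        op = code[pc]
        if op == 0x09:                      # RTS: leave the interpreter
            return a
        if op in (0x03, 0x07):              # LDA IMM / SBC IMM
            imm = code[pc + 1] ^ MASK8
            pc += 2
            a = imm if op == 0x03 else (a - imm) & 0xFF
            continue
        addr = (code[pc + 1] | (code[pc + 2] << 8)) ^ MASK16
        pc += 3
        if op == 0x00:                      # JMP
            pc = addr
        elif op == 0x02:                    # BEQ (assumed: taken when A is 0)
            if a == 0:
                pc = addr
        elif op == 0x04:                    # LDA ABSOLUTE
            a = mem[addr]
        elif op == 0x06:                    # STA ABSOLUTE
            mem[addr] = a
        else:
            raise ValueError(f"opcode {op:02X} is not modelled")
    raise RuntimeError("step limit reached")


def build_demo() -> bytes:
    """Store 1 at $11 and return 1 if mem[$10] == 5, else return 0."""
    return (op16(0x04, 0x10) + op8(0x07, 5) + op16(0x02, 11)
            + op8(0x03, 0) + b"\x09"
            + op8(0x03, 1) + op16(0x06, 0x11) + b"\x09")
```

Porting such a program to another machine means rewriting only run_pcode, which is the portability advantage described above.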
10:7.9 ROM regions
The Apple ][ ROM BIOS is full of little routines whose intention is clear, but whose meaning can be changed depending on the context. That leads into an interesting area of obfuscation and indirection. For our first example, there is a routine to save the register contents. It is used by the ROM BIOS code when a breakpoint occurs. It has the side-effect of returning the status register in the A register. That allows a program to replace the instruction pair PHP; PLA with the instruction JSR $FF4A for the same primary effect (it has the side-effect of altering several memory locations), but one byte larger.
For our second example, there is a routine to clear the primary text screen. Since the Apple ][ has a text and graphics mode that share the same memory region, there is one routine for clearing the screen while in text mode, and another for clearing the screen while in graphics mode. However, it is possible to use the graphics routine to clear the screen even while in text mode. That allows a program to replace JSR $FC58 with JSR $F832 for the same major effect. (It has the side-effect of altering several memory locations.)
For our third example, there is a routine to compare two regions of memory. It is used primarily to ensure that memory is functioning correctly. However, it can also be used to detect alterations such as those produced by a user attempting to patch a program. All that is required is to set the parameters correctly, like this:
LDA #> beghi
STA $3D
LDA #< beglo
STA $3C
LDA #> endhi
STA $3F
LDA #< endlo
STA $3E
LDA #> cmphi
STA $43
LDA #< cmplo
STA $42
JSR $FE36
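The effect of that call can be modelled as follows. This is a sketch of the comparison only, assuming an inclusive end address; the function name and the returned list of mismatches are hypothetical and do not reproduce how the ROM routine reports differences.

```python
def verify(mem: bytearray, beg: int, end: int, cmp: int) -> list:
    """Compare mem[beg..end] (inclusive) against the region starting at cmp.

    Returns a list of (address, value, expected) tuples, one per mismatch.
    """
    mismatches = []
    for offset in range(end - beg + 1):
        value = mem[beg + offset]
        expected = mem[cmp + offset]
        if value != expected:
            mismatches.append((beg + offset, value, expected))
    return mismatches
```

A protected program would keep a reference copy at the compare address and treat any mismatch, such as a BRK ($00) patched over an opcode, as evidence of tampering.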
For our fourth example, there is an RTS instruction at a known location. A jump to this instruction will simply return. It is usually used to determine the value of the Program Counter. However, it can just as easily be used to hide a transfer of control, taking into account that the destination address must be one less than the true value, like this to jump to $200:
LDA #$01
PHA
LDA #$FF
PHA
JMP $FF58
And so on. The first three examples are taken from Lady Tut, though in the third example, the parameters are also set in an obfuscated way, using shifts, increments, and constants. The fourth is taken from Mr. Do!.
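The off-by-one in the fourth example is easy to check with a small model of the 6502 stack. The helper names are hypothetical. The model relies only on the documented behaviour that PHA stores at $100+S and then decrements S, and that RTS pulls the low byte, then the high byte, and adds one.

```python
def bytes_to_push(target: int):
    """Return (high, low) of target-1, in the order they must be pushed."""
    ret = (target - 1) & 0xFFFF
    return ret >> 8, ret & 0xFF


def pha(stack_page: bytearray, sp: int, value: int) -> int:
    """Store value at $100+S, then decrement S (wrapping within the page)."""
    stack_page[sp] = value
    return (sp - 1) & 0xFF


def rts(stack_page: bytearray, sp: int):
    """Pull low then high byte, add one, and return (new_pc, new_sp)."""
    sp = (sp + 1) & 0xFF
    lo = stack_page[sp]
    sp = (sp + 1) & 0xFF
    hi = stack_page[sp]
    return (((hi << 8) | lo) + 1) & 0xFFFF, sp
```

For a target of $200, bytes_to_push gives $01 and $FF, matching the listing above.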
10:7.10 Sensitive memory locations
There are certain regions in memory, in which modifications can be made which will cause intentional side-effects. The side-effects include code-destruction when viewed, or automatic execution in response to any typed input, among other things. The zero-page is a rich source of targets, because it is shared by so many things.
The most commonly altered regions follow.
Scroll window
When the monitor is active, the scrollable region of the screen can be adjusted to allow “fixed” rows and/or columns. The four locations, left ($20), width ($21), top ($22), and bottom ($23) can also be adjusted. A program can protect itself from debugging attempts by altering these values to make a very small window, or even to cause overlapping regions that will cause memory corruption if scrolling occurs!
I/O vectors
There are two I/O vectors in the Apple ][, one for output—CSW ($36-37), and one for input—KSW ($38-39). CSW is invoked whenever the ROM BIOS routine COUT is called to display text. KSW is invoked whenever the ROM BIOS routine RDKEY is called to wait for user input. Both of these vectors are hooked by DOS in order to intercept commands that are typed at the prompt. Both of these vectors are often forcibly restored to their default values to unhook debuggers. They are sometimes altered to point to disk access routines, to prevent user interaction. Championship Lode Runner uses the hooks for disk access routines in order to load the level data from the disk.
Monitor
The monitor prompt allows a user to view and alter memory, and execute subroutines. It uses several zero-page addresses in order to do this. Anything that is stored in those locations ($31, $34-35, $3A-43, $45-49) will be lost when the monitor becomes active. In addition, the monitor uses the ROM BIOS routine RDKEY. RDKEY provides a pseudo-random number generator, by measuring the time between keypresses. It stores that time in $4E-4F.
Falcons uses address $31 to hold the rolling checksum, and checks if $47 is constant after initialising it.
Classmate uses addresses $31 and $4E to hold two of the data field prologue bytes.
The “LOCK” mystery
There is a special memory location in Applesoft ($D6) which is named the “AppleSoft Mystery Parameter” in What’s Where In The Apple. It is also named “LOCK” in the Applesoft Internals disassembly, which gives a better idea of its purpose. When set to #$80, all Applesoft commands are interpreted as meaning “RUN.” This prevents any user interaction at the Applesoft prompt. Tycoon uses this technique.
Stack
The stack is a single 256-byte page ($100-1FF) in the Apple ][. Since the standard Apple ][ environment does not have any source of interrupts, the stack can be considered to be a well-defined memory region.
This means that code and data can be placed on the stack, and run from there, without regard to the value of the stack pointer, and modifications will not occur unexpectedly. (The effect on the stack of subroutine calling is an expected modification.) If an interrupt occurred, then the CPU would save the program counter and status register on the stack, thus corrupting the code or data that existed below the current stack pointer. (The corruption can even be above the stack pointer, if the stack pointer value is low enough that it wraps around!) Correspondingly, any user interaction that occurs, such as breaking to the prompt, will cause corruption of the code or data that exist below the current stack pointer. Choplifter uses this technique.
Stack pointer
Since the standard Apple ][ environment does not have any source of interrupts, the stack pointer can be considered to be a register with well-defined value. This means that its value remains under program control at all times and that it can even be used as a general-purpose register, provided that the effect on the stack pointer of subroutine calling is expected by the program. Beer Run uses this technique.
LifeSaver also uses this technique for the purpose of obfuscating a transfer of control—the program checksums the pages of memory that were read in, and then uses the result as the new stack pointer, just prior to executing a “return from subroutine” instruction. Any alteration to the data, such as the insertion of breakpoints or detours, results in a different checksum and unpredictable behavior.
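The checksum-as-stack-pointer idea can be sketched as below. The XOR checksum is an assumption chosen for brevity (the actual algorithm is not described here), and the function names, sample bytes, and target address are hypothetical.

```python
def checksum(data: bytes) -> int:
    """Hypothetical one-byte checksum: XOR of every byte."""
    result = 0
    for b in data:
        result ^= b
    return result


def rts_target(stack_page: bytearray, sp: int) -> int:
    """Address reached by RTS when the stack pointer is sp."""
    lo = stack_page[(sp + 1) & 0xFF]
    hi = stack_page[(sp + 2) & 0xFF]
    return (((hi << 8) | lo) + 1) & 0xFFFF


def plant_return(stack_page: bytearray, sp: int, target: int) -> None:
    """Place target-1 where RTS will find it for the expected checksum."""
    ret = (target - 1) & 0xFFFF
    stack_page[(sp + 1) & 0xFF] = ret & 0xFF
    stack_page[(sp + 2) & 0xFF] = ret >> 8
```

The return address is planted only at the stack position that matches the checksum of the unmodified data, so a single patched byte sends the RTS somewhere else entirely.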
Input buffer
The input buffer is a single 256-byte page ($200-2FF) in the Apple ][. Code and data can be placed in the input buffer, and run from there. However, anything that the user types at the prompt, and which is routed through the ROM BIOS routine GETLN ($FD6A), will be written to the input buffer. Any user interaction that occurs, such as breaking to the prompt, will cause corruption of the code in the input buffer. Karateka uses this technique.
Primary text screen
The primary text screen is a set of four 256-byte pages ($400-7FF) in the Apple ][. Code and data can be placed in the text screen memory, and run from there. The visible screen was usually switched to a blank graphics screen prior to that occurring, to avoid visibly displaying garbage, and perhaps causing the user to think that the program was malfunctioning. Obviously, any user interaction that occurs through the ROM BIOS routines, such as breaking to the prompt and typing commands, will cause corruption of the code in the text screen. Joust uses this technique to hold essential data.
Non-maskable interrupt vector
When a non-maskable interrupt (NMI) occurs, the Apple ][ saves the status register and program counter onto the stack, reads the vector at $FFFA-FFFB, and then starts executing from the specified address. The ROM BIOS handler immediately transfers control to the code at $3FB-3FD, which is usually a jump instruction to the complete NMI handler. For programs that were very heavily protected, such that inserting breakpoints was difficult because of hooked CSW and KSW vectors, for example, one alternative was to “glitch” the system by using an NMI card to force an NMI to occur. However, that technique required direct access to memory in order to install the jump instruction at $3FB-3FD, since the standard ROM BIOS does not place one there.
On a 64kb Apple ][, the ROM BIOS could be copied into banked memory and made writable. The BIOS NMI vector could then be changed directly, potentially bypassing the user-defined NMI vector completely.
Reset vector
On a cold start, and whenever the user presses Ctrl-Reset, the Apple ][ reads the vector at $FFFC-FFFD, and then starts executing from the specified address. If the Apple ][ is configured with an Autostart ROM, then the warm-start vector at $3F2-3F3 is used, if the “power-up” byte at $3F4 matches the exclusive-OR of #$A5 with the value at $3F3.[58] The values at $3F2-3F4 are always writable, allowing a program to protect itself against a user pressing Ctrl-Reset in order to gain access to the monitor prompt, and then saving the contents of memory. The typical protected program response to Ctrl-Reset was to erase all of memory and then reboot.
On a 64kb Apple ][, the ROM can be copied into banked memory and made writable. When the user presses Ctrl-Reset on an Apple ][+, the ROM BIOS is not banked in first, meaning that the cold-start reset vector can be changed directly, and will be used, potentially bypassing the warm-start reset vector completely. On an Apple ][e or later, the ROM BIOS is banked in first, meaning that the modified BIOS cold-start reset vector will never be executed, and so the warm-start reset vector cannot be overridden.
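The relationship between the warm-start vector and the power-up byte can be written out directly. This sketch models memory as a bytearray; the function names and the handler address are hypothetical.

```python
def set_reset_vector(mem: bytearray, handler: int) -> None:
    """Point the warm-start vector at handler and fix up the power-up byte."""
    mem[0x3F2] = handler & 0xFF
    mem[0x3F3] = handler >> 8
    mem[0x3F4] = mem[0x3F3] ^ 0xA5


def warm_start_valid(mem: bytearray) -> bool:
    """True when $3F4 equals $3F3 XOR #$A5, so the vector will be used."""
    return mem[0x3F4] == (mem[0x3F3] ^ 0xA5)
```

A program that changes $3F3 without updating $3F4 leaves the check failing, so the warm-start vector is no longer honoured.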
Interrupt request vector
Despite not having a source of interrupts in the default configuration, the Apple ][ did offer support for handling them. When an interrupt request (IRQ) occurs, the Apple ][ saves the status register and program counter onto the stack, reads the vector at $FFFE-FFFF, and then starts executing from the specified address. However, there is also a special case IRQ, which is triggered by the BRK instruction.
This instruction is a single-byte breakpoint instruction, and is intended for debugging purposes. The ROM BIOS handler checks the source of the interrupt, and transfers control to the vector at $3FE-3FF if the source was an external interrupt. On the Autostart ROM, the ROM BIOS handler transfers control to the vector at $3F0-3F1 if the source was a breakpoint.[59] The values at $3F0-3F1, and $3FE-3FF are always writable, allowing a program to protect itself against a user inserting breakpoints in order to break when execution reaches the specified address. The typical protected program response to breakpoints was to erase all of memory and then reboot. An alternative protection is to point $3F0-3F1 to another BRK instruction, to produce an infinite loop and hang the machine. Bank Street Writer III uses this technique.
On a 64kb Apple ][, the ROM BIOS can be copied into banked memory and made writable. The BIOS IRQ vector can then be changed directly, potentially bypassing the user-defined IRQ vector completely.
10:7.11 Catalog tricks
Control-"Break"
On a regular DOS disk, there is a sector called the Volume Table Of Contents (VTOC), which describes the starting location (track and sector) of the catalog, among other things. The catalog sectors contain the list on the disk of files which are accessible by DOS. For a file-based program, apart from the DOS and the catalog-related structures, all other content is accessible through the files listed in the catalog. DOS knows the track which holds the VTOC, since the track number (usually #$11) is hard-coded in DOS itself, and sector zero is assumed to be the one that holds the VTOC.
Since the files are listable, they can also be loaded from the original disk, and then saved to a copy of the disk. One way to prevent that is to insert control-characters in the filenames. Since control-characters are not visible from the DOS prompt, any attempt to load a file, using the name exactly as it appears, will fail.
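The mismatch between what is listed and what is stored can be shown with a toy catalog. This is a simplified model: the catalog dictionary, the sample filename, and the rule that every character below $20 is invisible are assumptions for illustration, and the on-disk filename encoding is ignored.

```python
def as_displayed(name: str) -> str:
    """Drop characters below 0x20, which this model treats as invisible."""
    return "".join(ch for ch in name if ord(ch) >= 0x20)


def load(catalog: dict, typed_name: str):
    """Exact-match lookup, standing in for the DOS file search."""
    return catalog.get(typed_name)


# Hypothetical catalog whose only filename hides a Ctrl-A between A and M.
CATALOG = {"GA\x01ME": b"program"}
```

The listing shows GAME, but load(CATALOG, "GAME") returns None, because the stored name contains a character that the user never sees.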
PoC or GTFO, Volume 2 Page 17