The 1.5MHz 6502 CPU used in Battlezone, Red Baron, and Tempest wasn't fast enough to perform the arithmetic required for a 3D game at a reasonable frame rate. To make these games possible, Atari engineers devised the "math box", a board with a 16-bit ALU constructed from four 4-bit AMD 2901 bit-slice processors running at 3MHz.
In October 2021, the original Battlezone source code was made available. The internal documentation for the math box can be found here. Prior to that, the best description of what the math box provides was the MAME implementation, which simulates the features without emulating the hardware and ROM. Some more human-readable explanations of the feature set can be found on the Retrocomputing Q&A site, but to really understand what it does you also need to see how it's used by a game. The math box is not a collection of general-purpose math functions, but rather an implementation of features required specifically for 3D games.
(If you want to see a first-person tank game implemented in 6502 assembly without additional hardware, see the Stellar 7 disassembly.)
(This is intended as a quick introduction for anyone not familiar.)
Performing arithmetic with floating-point numbers is too slow for use in a game on 8-bit hardware. Some games used 8.8 fixed-point numbers, where a 16-bit value has the integer part in the high byte and the fractional part in the low byte. While addition and subtraction work the same as they do for 16-bit integers, multiplication and division require minor adjustments.
Suppose we want to compute 9.5 x 3.75 = 35.625. In 8.8 fixed-point the inputs would be represented as $0980 and $03c0. Multiplying them together yields the 32-bit value $0023a000. To get the result we want, we need to shift it 8 bits to the right, to get the 16-bit value $23a0 ($23=35, $a0=160, 160/256=0.625, so it all checks out). Fixed-point arithmetic code will either right-shift to adjust the magnitude, or will construct its loops in a way that leaves the result properly positioned.
Division has complementary behavior: the dividend must be shifted to the left before the operation is performed. To compute 9.5 / 3.75, we divide the 32-bit value $00098000 by $03c0 to get $0288 (2.53125, slightly below the exact answer of 2.5333 because the low bits are truncated).
If the inputs have placed the fixed point differently, e.g. we're multiplying by an 8.0 integer or a 0.16 fraction, the size of the pre- or post-shift must be adjusted accordingly.
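The two worked examples above can be checked with a short Python sketch. The helper names (`to_fixed_8_8`, `mul_8_8`, `div_8_8`) are made up for illustration and handle unsigned values only; they are not taken from any game code.

```python
def to_fixed_8_8(value):
    """Convert a non-negative number to unsigned 8.8 fixed point."""
    return int(round(value * 256)) & 0xFFFF

def mul_8_8(a, b):
    """Multiply two unsigned 8.8 values: post-shift the product right by 8."""
    return ((a * b) >> 8) & 0xFFFF

def div_8_8(a, b):
    """Divide two unsigned 8.8 values: pre-shift the dividend left by 8."""
    return ((a << 8) // b) & 0xFFFF

a = to_fixed_8_8(9.5)     # $0980
b = to_fixed_8_8(3.75)    # $03c0
print(hex(mul_8_8(a, b)))  # 0x23a0 -> 35.625
print(hex(div_8_8(a, b)))  # 0x288  -> 2.53125
```

For other fixed-point layouts, only the shift counts change: the post-shift after a multiply equals the number of fractional bits in one operand, and the pre-shift before a divide equals the number of fractional bits in the divisor.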
Signed values require additional consideration. Suppose you multiply the 8.0 value +7 ($07) by the 0.8 fractional value +0.5 ($80). That yields the 8.8 value $0380, which you then shift right 8x to discard the fraction and form the 8.0 result +3 ($03). All good. Suppose you instead want to multiply by -0.5 ($c0). Signed multiplication works by sign-extending the fractional part before performing the multiplication (or by working with absolute values and tracking the sign). In this case we'd multiply $07 by $ffc0, yielding $fe40, which we would shift right 7x to get the integer. A plain arithmetic shift rounds toward negative infinity and gives $fc (-4); rounding the exact answer of -3.5 toward zero, as in the unsigned case, gives $fd (-3). We right-shift one fewer time because the sign bit doesn't count as a fractional bit. It's more accurate to describe a signed 8-bit fraction as a 1.7 value.
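As a rough check of the signed case, here is a Python sketch of the sign-extend-then-multiply approach. The function names are hypothetical. Python's `>>` on a negative integer behaves like an arithmetic shift, so the exact product of -3.5 rounds down to -4 ($fc); an implementation that works with absolute values and restores the sign afterward would instead produce -3 ($fd).

```python
def sign_extend_8(byte):
    """Interpret an 8-bit value as two's complement."""
    return byte - 0x100 if byte & 0x80 else byte

def mul_int_by_frac_1_7(integer, frac_byte):
    """Multiply an 8.0 integer by a signed 1.7 fraction, returning an 8-bit result."""
    product = integer * sign_extend_8(frac_byte)   # 7 * -64 = -448 ($fe40)
    return (product >> 7) & 0xFF                   # shift right 7x, not 8x

print(hex(mul_int_by_frac_1_7(0x07, 0x40)))  # +0.5 as 1.7 -> 0x3
print(hex(mul_int_by_frac_1_7(0x07, 0xC0)))  # -0.5 as 1.7 -> 0xfc
```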
Objects in three-dimensional space have a position and an orientation. 3D rendering involves moving things around in 3D space and then projecting the result onto a 2D surface (the screen).
Jed Margolin's "Unit Vector Math for 3D Graphics" describes a left-handed coordinate system in which +X points into the monitor, +Y to the right, and +Z up. The game itself does seem to use a left-handed system, but rotated 90 degrees: when the player is facing angle 0, moving forward increases Y. Turning slightly left increases the angle, and moving forward begins to increase X as well.
The Battlezone sources apply the coordinate system inconsistently. For example, the logo that slides off into the distance is in three parts, "Ba", "ttle", and "Zone". "Ba" is above and to the left of the center point and tilted into the screen, so all values should be +X/-Y/+Z, but if you look at the vertex list you'll note that all of its coordinates are positive. It's necessary to flip the object coordinates to make it come out right. (The vertex transformation code appears to do this.)
For the sake of sanity it's useful to apply a more common set of labels to the various axes: +X to the right, +Y up, +Z into the monitor. The trouble with doing so is that, as the viewer sees it, +X is to the left. Except, as it turns out, it isn't: the vertex transformation code negates the screen X coordinate, essentially mirror-imaging the screen. So +X is to the right where it should be when most of the math is being performed, but ends up on the left at the end. Which works right for some things, but reverses the direction of rotation.
The labeling used in the disassembly was chosen because it makes the math in the game work with the fewest mental remappings: +X to the left, +Y up, +Z into the monitor. When facing angle 0 you're looking toward +Z; increasing the angle rotates counter-clockwise toward +X.
Happily, the 2D coordinate system used for the monitor is conventional Cartesian: center at (0,0), +X/+Y to the right and up.
A simple way to think about the math box is as an opaque device with 32 8-bit inputs and three 8-bit outputs. When the 6502 writes a value to one of the inputs, the math box takes the byte of data and executes a function. Some of the functions simply save the data, some perform complex operations. All functions write a 16-bit value to the output, setting and clearing a "device busy" flag byte so the 6502 can tell when the computation has finished.
The 6502 does not need to wait for the "busy" flag to clear after every operation so long as the upper bound of the execution time is known. In some cases the Battlezone code simply delays for a few cycles before reading the results. This must be done carefully, as starting a new operation will halt the previous operation mid-calculation (as suggested by the documentation here, and confirmed by Matthew Hagerty from the hardware schematics).
The device has sixteen 16-bit "registers" that retain state between calls, as well as a bit of temporary storage for computation. These are not literally CPU registers, but since we're treating the device as opaque it's okay to think of them that way. For convenience we will refer to them in hex notation, as R0 through RF. The values in the first 12 can be set directly. The last four are only used internally.
Each function is listed with its index, the address label used in the disassembly, and a brief description of the function it performs.
Index | Label | Function (approximate) |
---|---|---|
$00 | MB_SET_R0L | Set R0 low; result=R0 |
$01 | MB_SET_R0H | Set R0 high; result=R0 |
$02 | MB_SET_R1L | Set R1 low; result=R1 |
$03 | MB_SET_R1H | Set R1 high; result=R1 |
$04 | MB_SET_R2L | Set R2 low; result=R2 |
$05 | MB_SET_R2H | Set R2 high; result=R2 |
$06 | MB_SET_R3L | Set R3 low; result=R3 |
$07 | MB_SET_R3H | Set R3 high; result=R3 |
$08 | MB_SET_R4L | Set R4 low; result=R4 |
$09 | MB_SET_R4H | Set R4 high; result=R4 |
$0a | MB_SET_R5L | Set R5 low; result=R5 |
$0b | MB_ROT_Z | Set R5 high; R4 = R4 - R2; R5 = R5 - R3; result = (R0 * R4) - (R1 * R5) |
$0c | MB_SET_R6 | Set R6; result=R6 |
$0d | MB_SET_RAL | Set RA low; result=RA |
$0e | MB_SET_RAH | Set RA high; result=RA |
$0f | MB_SET_RBL | Set RB low; result=RB |
$10 | MB_SET_RBH | Set RB high; result=RB |
$11 | MB_SCREEN_X | Set R5 high; R7 = (R0 * R4) - (R1 * R5) + R2; R8 = (R1 * R4) + (R0 * R5) + R3; result = R8 / R7 |
$12 | MB_ROT_X | Data ignored; when called after $0b: result = (R1 * R4) + (R0 * R5) |
$13 | MB_DIVIDE_87 | Data ignored; result=R8 / R7 |
$14 | MB_DIVIDE_B7 | Data ignored; result=RB / R7 |
$15 | MB_SET_R7L | Set R7 low; result=R7 |
$16 | MB_SET_R7H | Set R7 high; result=R7 |
$17 | MB_GET_R7 | Data ignored; result=R7 |
$18 | MB_GET_R9 | Data ignored; result=R9 |
$19 | MB_GET_R8 | Data ignored; result=R8 |
$1a | MB_SET_R8L | Set R8 low; result=R8 |
$1b | MB_SET_R8H | Set R8 high; result=R8 |
$1c | MB_CLIP | Set R5 high; result=midpoint subdivision |
$1d | MB_CALC_DIST | Set R3 high; R2 = abs(R2 - R0); R3 = abs(R3 - R1); continue into $1e |
$1e | MB_CALC_HYPOT | Data ignored; result=approximation of sqrt(R2^2 + R3^2) |
$1f | MB_UNKNOWN | Behavior unknown |
Battlezone invokes all functions except $13, $18, $1a, $1b, $1c, and $1f. Function $13 is used as part of $11 but not invoked directly. Function $1c appears to have been intended for a line clipping function that used midpoint subdivision, but the vector hardware performs that function. The system diagnostics code has a routine that rapidly invokes every function in sequence, but it doesn't check the results.
The game uses function $0c to set R6 to $0a once each frame. Note there's no way to set the high byte of R6, but since it's only used to specify a small integer count for the division routine there's really no need.
Atari's internal documentation lists functions $1d-1f as unused. The math box source code itself is here, which notes that the clip function ($1c) was added for Malibu Grand Prix by Ed Logg, and the distance function ($1d/1e) was added for Battlezone by Ed Rotberg.
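To make the "write a byte, get a 16-bit result" model concrete, here is a toy Python sketch covering only the register-set functions from the table. The class and table names are invented for illustration. It assumes that writing one half of a register leaves the other half unchanged, and it does not model timing, the busy flag, or any of the computational functions.

```python
# Function index -> (register number, which half is written), per the table above.
SET_FUNCTIONS = {}
for index in range(0x00, 0x0B):            # $00-$0a: R0 low, R0 high, ... R5 low
    SET_FUNCTIONS[index] = (index >> 1, "H" if index & 1 else "L")
SET_FUNCTIONS.update({
    0x0C: (0x6, "L"),                      # R6 has no "set high" function
    0x0D: (0xA, "L"), 0x0E: (0xA, "H"),
    0x0F: (0xB, "L"), 0x10: (0xB, "H"),
    0x15: (0x7, "L"), 0x16: (0x7, "H"),
    0x1A: (0x8, "L"), 0x1B: (0x8, "H"),
})

class ToyMathBox:
    """Opaque-device model: sixteen 16-bit registers plus a 16-bit result."""

    def __init__(self):
        self.regs = [0] * 16
        self.result = 0

    def write(self, index, data):
        """Store one byte into half of a register; the result is the whole register."""
        reg, half = SET_FUNCTIONS[index]
        if half == "L":
            self.regs[reg] = (self.regs[reg] & 0xFF00) | (data & 0xFF)
        else:
            self.regs[reg] = (self.regs[reg] & 0x00FF) | ((data & 0xFF) << 8)
        self.result = self.regs[reg]
        return self.result

mb = ToyMathBox()
mb.write(0x00, 0x34)              # MB_SET_R0L
print(hex(mb.write(0x01, 0x12)))  # MB_SET_R0H -> 0x1234
```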
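Battlezone uses the math box to do five things:
- transform object positions into view space, so off-screen objects can be culled
- transform object vertices and project them onto the screen
- position the blips on the radar
- approximate the distance between two points
- perform the division needed to compute the angle to a target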
Before we go into the details, it's worth taking a minute to review how 3D graphics work.
In the classic implementation you take an object, rotate it, and translate all of its vertices to the correct position and orientation in the world (the Model transform). Then you rotate and translate it so it's correctly positioned relative to the viewer (the View transform). Finally you project the coordinates into 2D "clip space", and apply a Viewport transform to put it in the right place on the screen.
Battlezone rearranges this a bit. It performs the View translation and rotation first, so that it can detect when objects are completely off-screen and exclude them from further consideration. It generates a list of visible objects, with four values for each: object type, facing, object center X position, and object center Z position. These values are fed into the vertex transformation function, which performs the Model transform and projects the vertex positions onto screen X/Y coordinates. Clipping at window edges is performed by the hardware.
Battlezone's objects can rise above ground level but can only rotate about the Y axis, so for the Model and View transforms we need to apply a single-axis rotation and translation along all axes.
Object positions are tracked in world space with 16-bit X/Z values that wrap around at the map edges. The object cull "near" plane is $3ff and the "far" plane is $7aff, meaning objects positioned closer or farther than that are not considered visible. Note the far plane is nearly half the width of the battlefield. You can tell that the game uses a clip plane rather than a clip distance because, if you back up until an object in the center of the viewer just disappears, rotating to the left or right will make it reappear.
The formula for rotation about the Y axis is:
    newX = X * cos(theta) - Z * sin(theta)
    newZ = X * sin(theta) + Z * cos(theta)
Battlezone uses a table of sin/cos values stored as 16-bit signed fixed-point fractions. After multiplication, the result must be right-shifted 15x, because the fractional part is only 15 of the 16 bits. The math box code right-shifts it 16x, effectively halving the result. (This is compensated for in the code or in the data.)
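The effect of the extra shift is easy to see with a single multiplication. This is a Python sketch with a hypothetical `to_1_15` helper; the actual table values used by the game are not reproduced here.

```python
import math

def to_1_15(value):
    """Convert a value in [-1, 1] to a signed 1.15 fraction, clamped to 16 bits."""
    return max(-0x8000, min(0x7FFF, int(round(value * 0x8000))))

coord = 1000
cos_60 = to_1_15(math.cos(math.radians(60)))   # 0.5 -> $4000

product = coord * cos_60      # 32-bit intermediate
correct = product >> 15       # 15 fractional bits: 500
halved = product >> 16        # shifting 16x instead: 250

print(correct, halved)        # 500 250
```

Doubling the 16x-shifted value recovers the expected magnitude, at the cost of the lowest bit.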
The game must take the center coordinates of the various objects (enemy unit or flying chunks, friendly and enemy projectiles and projectile explosions, the saucer, obstacles), and compute a new position that is relative to the position and facing of the player. Once an object's position is known, it can be evaluated to see if it's currently in the player's field of view. If it's behind the viewer, beyond the far plane, or too far to the left or right, we can save time by not transforming the object's vertices.
The position calculation is straightforward. Starting with the object's position in "world space" coordinates, we subtract the viewer's position to get the object's position relative to the viewer:
    rel_X = obj_world_X - viewer_X
    rel_Z = obj_world_Z - viewer_Z
Then just plug the coordinates into the rotation formula, using the viewer's facing angle as theta, to get the "view space" coordinates:
    view_X = rel_X * cos(theta) - rel_Z * sin(theta)
    view_Z = rel_X * sin(theta) + rel_Z * cos(theta)
The math box implements this directly, with a bit of confusion about how the axes are defined. The 6502 code begins by setting up registers for the viewer:
    R0 = cos(theta) as signed 1.15 fraction
    R1 = -sin(theta) as signed 1.15 fraction
    R2 = 16-bit viewer_Z
    R3 = 16-bit viewer_X
Note that R1 is set to negative sin(theta). For each object under consideration:
    R4 = 16-bit obj_world_Z
    R5 = 16-bit obj_world_X
    invoke function $0b:
      R4 -= R2           # rel_Z = obj_world_Z - viewer_Z
      R5 -= R3           # rel_X = obj_world_X - viewer_X
      RC = R4 * R0
      R7 = R5 * -R1
      result = R7 + RC   # view_Z = rel_X * sin(theta) + rel_Z * cos(theta)
    invoke function $12:
      RC = R4 * R1
      R8 = R5 * R0
      result = R8 + RC   # view_X = rel_X * cos(theta) - rel_Z * sin(theta)
The results are doubled to compensate for the 16-bit shift.
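The register sequence above can be approximated in Python as follows. This is a sketch under several assumptions: each product is shifted right 16x individually (the real microcode may accumulate differently), the 16-bit wrap-around of world coordinates is ignored, and the function and helper names are invented.

```python
import math

def to_1_15(value):
    """Convert a value in [-1, 1] to a signed 1.15 fraction, clamped to 16 bits."""
    return max(-0x8000, min(0x7FFF, int(round(value * 0x8000))))

def mul_shift16(a, b):
    """Multiply and shift right 16x, which halves a 1.15 product."""
    return (a * b) >> 16

def view_transform(obj_x, obj_z, viewer_x, viewer_z, theta):
    """Return (view_X, view_Z) for an object, following functions $0b and $12."""
    r0 = to_1_15(math.cos(theta))
    r1 = to_1_15(-math.sin(theta))
    r4 = obj_z - viewer_z                                    # rel_Z (R4 -= R2)
    r5 = obj_x - viewer_x                                    # rel_X (R5 -= R3)
    view_z = mul_shift16(r4, r0) + mul_shift16(r5, -r1)      # function $0b
    view_x = mul_shift16(r4, r1) + mul_shift16(r5, r0)       # function $12
    return view_x * 2, view_z * 2                            # undo the halving

print(view_transform(1000, 0, 0, 0, math.pi / 2))   # (0, 1000)
```

With theta = 0 the result comes out slightly short (998 instead of 1000 for an object 1000 units ahead), because +1.0 cannot be represented exactly in 1.15 and the low bits are lost to the shift.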
Once we have the position, we cull any shapes that are outside the view frustum. Objects whose center points are closer than the near plane ($03ff) or farther than the far plane ($7aff) are dropped. The left/right frustum bounds clipping is done by comparing the absolute value of the X coordinate to the Z coordinate, and culling any object where abs(X) > abs(Z). This is a very fast clip for a 90-degree FOV. Since the game uses a 45 degree FOV it's a little loose. The clip is based on the object center points, so it needs to be a little loose or things would vanish when they're still halfway on screen. The frustum bounds check is simple enough that the math box doesn't need to get involved.
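The culling test is simple enough to express in a few lines. In this Python sketch the constant and function names are made up, and the exact handling of values that land precisely on a boundary is an assumption.

```python
NEAR_PLANE = 0x03FF
FAR_PLANE = 0x7AFF

def is_visible(view_x, view_z):
    """Cull by object center: near/far planes, then a 90-degree left/right test."""
    if view_z < NEAR_PLANE or view_z > FAR_PLANE:
        return False                      # behind, too close, or too far
    return abs(view_x) <= abs(view_z)     # keep if within 45 degrees of center

print(is_visible(0x0800, 0x1000))   # True: ahead and inside the wedge
print(is_visible(0x2000, 0x1000))   # False: too far to the side
```

Because the test compares magnitudes only, it needs no multiplication or division, which is why the 6502 can handle it without the math box.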
Side note: saucers don't make noises when they're not visible, but the definition of visibility is the 90-degree FOV, so they're still audible when off screen to the left or right.
The code projects the X/Y/Z coordinates of an object to X/Y coordinates on screen. The inputs to the function are the four values output from the View transform: type, facing, X position, Z position. We need to apply a transformation that rotates and translates the object's individual vertices (Model transform), then divides X and Y by Z to perform perspective projection.
So, given the object's "view space" X/Y/Z and vertex X/Y/Z, we need to compute "screen space" X/Y, like this:
    rotated_Z = vertex_X * sin(theta) + vertex_Z * cos(theta)
    model_Z = rotated_Z + object_Z
    rotated_X = vertex_X * cos(theta) - vertex_Z * sin(theta)
    model_X = rotated_X + object_X
    screen_X = model_X / model_Z
    model_Y = vertex_Y + object_Y
    screen_Y = model_Y / model_Z
Here, the angle theta is the object's facing relative to the viewer.
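Written as floating-point Python, the equations above look like this. It is only a sketch of the idealized math: it ignores fixed-point scaling, the scale factor applied by the division routine, and any sign adjustments made by the actual game code. The function name and tuple layout are invented.

```python
import math

def project_vertex(vertex, obj_pos, theta):
    """Rotate a vertex about Y, translate by the object position, divide by Z."""
    vx, vy, vz = vertex
    ox, oy, oz = obj_pos
    s, c = math.sin(theta), math.cos(theta)
    model_z = vx * s + vz * c + oz     # rotated_Z + object_Z
    model_x = vx * c - vz * s + ox     # rotated_X + object_X
    model_y = vy + oy                  # Y is not affected by the rotation
    return model_x / model_z, model_y / model_z

# A vertex at the object's center, 200 units ahead and offset right and up.
print(project_vertex((0, 0, 0), (100, 50, 200), 0.0))   # (0.5, 0.25)
```

The same model_Z is used as the divisor for both screen coordinates, so it only needs to be computed once per vertex.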
The math box functions again implement this directly. For each object, the 6502 code sets math box registers for the facing angle and center position:
    theta = player facing - object facing
    theta += 180 degrees
    R0 = -cos(theta) as signed 1.15 fraction
    R1 = sin(theta) as signed 1.15 fraction
    R2 = 16-bit object_Z
    R3 = 16-bit object_X
    RA = 0
Note R0 is set to negative cos(theta), and the object is rotated 180 degrees. Then, for each vertex:
    R4 = 16-bit vertex_Z
    R5 = 16-bit vertex_X
    invoke function $11:
      RC = R4 * R0       # RC = vertex_Z * -cos(theta)
      R7 = R5 * -R1      # R7 = vertex_X * -sin(theta)
      R7 += RC           # rotated_Z = -(vertex_X * sin(theta) + vertex_Z * cos(theta))
      R7 += R2           # model_Z = rotated_Z + object_Z
      (continue into function $12)
      RC = R4 * R1       # RC = vertex_Z * sin(theta)
      R8 = R5 * R0       # R8 = vertex_X * -cos(theta)
      R8 += RC           # rotated_X = -(vertex_X * cos(theta) - vertex_Z * sin(theta))
      R8 += R3           # model_X = rotated_X + object_X
      (continue into function $13)
      result = R8 / R7   # model_X / model_Z
    screen_X = result
    RB = vertex_Y + object_Y
    invoke function $14:
      result = RB / R7   # model_Y / model_Z
    screen_Y = result
Note the use of -cos(theta) negates the rotated X/Z values. A couple of other details that balance out some of the odd math:
- The object_Y coordinate is modified by the horizon shift that happens when the player is hit or drives into something.
- The screen_X result is negated, effectively mirror-imaging the screen but only for the 3D-rendered elements (i.e. not the background or messages).

The description of the math box implementation has been simplified greatly. If you want to dig into it, the MAME source code is dense but fairly brief. There are a couple of finer points that should be called out here though:
The API seems to be designed to minimize the interactions between the 6502 and the math box. For example, the 6502 code sets the object's X/Z position once, then supplies the X/Z coordinates for the vertices, and lets the math box add the object position to the vertex position. For the Y coordinate, the same addition is performed on the 6502, presumably because it's cheap and doesn't come after a rotation.
The position of the blip on the radar requires essentially the same calculation as the model transform (translate then rotate), but without the perspective projection. The easiest way to do that is to invoke function $11 and ignore the result. The center of the radar is at screen coordinate (0,316), which we can drop into the equations.
The code looks like this:
    R0 = sin(theta)
    R1 = cos(theta)
    R2 = 0
    R3 = 316
    R4 = distance        # computed distance / 256
    R5 = 0
    invoke function $11:
      RC = R4 * R0
      R7 = R5 * -R1
      R7 += RC           # R7 = distance * sin(theta)
      R7 += R2
      (continue into function $12)
      RC = R4 * R1
      R8 = R5 * R0
      R8 += RC           # R8 = distance * cos(theta)
      R8 += R3           # R8 += 316
      (continue into function $13)
      result = R8 / R7
R7 is used as the X coordinate, R8 as the Y coordinate.
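Stripped of the register shuffling, the radar calculation reduces to a polar-to-Cartesian conversion with an offset. A floating-point Python sketch, with invented names and no fixed-point scaling:

```python
import math

RADAR_CENTER_Y = 316   # the radar center is at screen coordinate (0, 316)

def radar_blip(distance, theta):
    """Return the (x, y) screen position of a blip; mirrors R7 and R8 above."""
    x = distance * math.sin(theta)                     # R7 (R5 is zero, R2 is zero)
    y = distance * math.cos(theta) + RADAR_CENTER_Y    # R8 (R3 holds 316)
    return x, y

print(radar_blip(10, 0.0))   # (0.0, 326.0)
```

Setting R5 to zero is what lets the general rotation function double as a polar conversion: the terms involving R5 vanish, leaving only distance times sin and cos.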
An article on distance approximations describes "a fast approximation of 2D distance based on an octagonal boundary", with the formula:
distance = 0.41 * dx + 0.941246 * dy (for dy > dx; flip the values if dy < dx)
Math box function $1e uses an approximation to the approximation:
distance = 0.375 * dx + 1.0 * dy
0.375 is 3/8, so this is trivial to compute with a few adds and shifts.
The math box provides two entry points. Function $1d subtracts two sets of 16-bit coordinate values and stores their absolute values in R2/R3, then falls into function $1e. If you view these as the lengths of the short sides of a right triangle, then function $1e is calculating the length of the hypotenuse.
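A Python sketch of the two entry points combined, using only adds and shifts for the 3/8 factor. The function name is hypothetical, and the particular sequence of shifts is an assumption; the math box microcode may arrange the arithmetic differently.

```python
def approx_distance(x0, z0, x1, z1):
    """Approximate sqrt(dx^2 + dz^2) as larger + 3/8 * smaller."""
    dx = abs(x1 - x0)                 # function $1d: absolute differences
    dz = abs(z1 - z0)
    small, large = min(dx, dz), max(dx, dz)
    three_eighths = (small + (small << 1)) >> 3   # (small * 3) / 8
    return large + three_eighths      # function $1e

print(approx_distance(0, 0, 300, 400))   # 512 (exact answer: 500)
print(approx_distance(0, 0, 800, 800))   # 1100 (exact answer: about 1131)
```

The approximation is exact along the axes and drifts by a few percent elsewhere, overestimating in the first example and underestimating along the diagonal.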
The distance between two objects is used to determine whether or not they have collided. Collisions between units (e.g. a tank driving into an obstacle) use different values than collisions with a projectile, which is why you can shoot past obstacles that you can't drive past. Projectile collisions also take unit facing into account.
Suppose you have the positions of the enemy unit and the player, and want to compute the heading to which the enemy unit must rotate so it can fire its cannon and hit the player. We can treat this as a basic trigonometry problem by drawing a right triangle with the units at opposite corners of the hypotenuse. The lengths of the other sides are simply the difference between the unit X coordinates and Z coordinates, respectively.
Given a right triangle, you can compute the (non-90-degree) angle with arccos(), arcsin(), or arctan(). arccos() and arcsin() require knowing the length of the hypotenuse, so arctan() is the function of choice. You divide the length of one side by the other side and plug the result into the arctan() function.
Battlezone uses math box function $14 to perform integer division, then uses a lookup table to get the angle.
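The divide-then-look-up approach can be sketched in Python as shown below. Everything specific here is an assumption made for illustration: the table size, the use of degrees, the way the result is folded into the correct quadrant, and the angle convention (0 facing +Z, increasing toward +X). The game's actual table and angle units are not reproduced.

```python
import math

TABLE_SIZE = 256   # hypothetical resolution
# ATAN_TABLE[i] holds atan(i / TABLE_SIZE) in degrees, covering ratios 0..1.
ATAN_TABLE = [math.degrees(math.atan(i / TABLE_SIZE)) for i in range(TABLE_SIZE + 1)]

def heading_to(dx, dz):
    """Angle in degrees from one unit to another, given coordinate differences."""
    ax, az = abs(dx), abs(dz)
    if ax == 0 and az == 0:
        return 0.0
    if ax <= az:
        # Ratio is at most 1, so it indexes the table directly (0..45 degrees).
        angle = ATAN_TABLE[(ax * TABLE_SIZE) // az]
    else:
        # Swap the sides to keep the ratio at most 1 (45..90 degrees).
        angle = 90.0 - ATAN_TABLE[(az * TABLE_SIZE) // ax]
    if dz < 0:
        angle = 180.0 - angle    # target is behind
    if dx < 0:
        angle = 360.0 - angle    # target is on the -X side
    return angle % 360.0

print(round(heading_to(100, 100)))    # 45
print(round(heading_to(-100, 0)))     # 270
```

Dividing the shorter side by the longer one keeps the quotient in the range 0 to 1, so a single small table is enough for the full circle.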
Copyright 2020 by Andy McFadden