
How Michael Abrash doubled Quake framerate

fabiensanglard.net
Summary (TL;DR)
Hand-written assembly nearly doubled Quake's framerate, from 22.7 to 42.2 fps on a 233 MHz Pentium MMX. The largest gains came from D_DrawSpans8 (12.6 fps), R_DrawSurfaceBlock8_mip* (4.2 fps), and D_Polyset* (2.2 fps). Techniques included loop unrolling, self-modifying code to avoid tying up registers, overlapping FDIV with integer work, jump tables to avoid branch mispredictions, and exploiting the Pentium's dual pipelines and free FXCH instruction to hide FPU latency. TransformVector, for example, interleaves its three dot products instead of computing them one after another, which avoids FPU stalls.
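The TransformVector idea can be illustrated in C. This is a minimal sketch of the instruction ordering only, not the original assembly: the function names and the explicit `right`, `up`, and `fwd` parameters are hypothetical, and a modern compiler is free to reorder either version. The serial form finishes one dot product before starting the next, so each add waits on the previous result. The interleaved form keeps three independent accumulators, which is the kind of independence the hand-written version exploits.

```c
typedef float vec3_t[3];

/* One dot product: each add depends on the result before it. */
static float dot3(const vec3_t a, const vec3_t b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* Serial form: three dependent chains, evaluated one after another. */
void transform_vector_serial(const vec3_t in, const vec3_t right,
                             const vec3_t up, const vec3_t fwd, vec3_t out)
{
    out[0] = dot3(in, right);
    out[1] = dot3(in, up);
    out[2] = dot3(in, fwd);
}

/* Interleaved form: three independent accumulators advance together,
 * so consecutive operations never depend on each other's result. */
void transform_vector_interleaved(const vec3_t in, const vec3_t right,
                                  const vec3_t up, const vec3_t fwd, vec3_t out)
{
    float x = in[0] * right[0];
    float y = in[0] * up[0];
    float z = in[0] * fwd[0];

    x += in[1] * right[1];
    y += in[1] * up[1];
    z += in[1] * fwd[1];

    x += in[2] * right[2];
    y += in[2] * up[2];
    z += in[2] * fwd[2];

    out[0] = x;
    out[1] = y;
    out[2] = z;
}
```

Both functions perform the same multiplications and additions in the same per-component order, so they return the same values. Only the scheduling differs, which is where the Pentium's pipelined FPU and free FXCH come into play in the assembly version.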