Tetris-Assembly (x86 & x64)

Update 25.01.2026: x64 Visual Enhancements
The x64 version received significant visual and UX improvements, leveraging modern Windows 11 APIs while maintaining the lightweight assembly approach:
| Feature | Description |
|---|---|
| Mica Backdrop Effect | Dark mode title bar with Windows 11 Mica material (DWMWA_USE_IMMERSIVE_DARK_MODE + DWMWA_SYSTEMBACKDROP_TYPE) for a sleek, modern appearance |
| Segoe UI Typography | All UI elements now use the Segoe UI font family with proper weight variations for improved readability |
| Green Player Field | Player name input field changes to light green (#E0FFE0) when text is entered, providing visual feedback that the name is set |
| Gold Line Clear Animation | Clearing lines triggers a smooth 300ms fade-out animation from gold (RGB 255,215,0) to black, replacing instant row removal |
| Modern Button Styling | Buttons use smaller, cleaner font styling consistent with Windows 11 design language |
| Resource Files | Added tetris.rc (resource script) and tetris.manifest (application manifest for DPI awareness and visual styles) |
Binary size impact: x64 binary increased from ~15 KB to ~18 KB (+20%) due to DWM API integration and animation code.
Note: These enhancements are exclusive to the x64 version. The x86 version remains unchanged at ~13 KB with classic Windows styling. The x64 implementation demonstrates more advanced Win32/DWM techniques due to the additional complexity already inherent in 64-bit assembly programming.
A high-performance, lightweight Tetris implementation written in pure Assembly (MASM) utilizing the Windows API. The project focuses on minimal binary footprint, efficient memory management, and direct OS integration without C Runtime (CRT) dependency.
Available in two architectures:
- x86 (32-bit): ~13 KB binary
- x64 (64-bit): ~18 KB binary
Subsystem: Windows (GUI)
Quick Links
- Download Binaries: tetris.zip (v1.0.0) - Contains both x86 and x64 versions
- Source Repository: GitHub
Architecture Comparison: x86 vs x64
Binary & Source Code Metrics
| Metric | x86 (32-bit) | x64 (64-bit) | Notes |
|---|---|---|---|
| Final Binary Size | ~13 KB | ~18 KB | +38% size increase due to 64-bit pointers, alignment, and DWM/Mica integration |
| Source Code Lines | ~2,400 LOC | ~2,600 LOC | +8% lines for manual calling convention management |
| Calling Convention | stdcall | Microsoft x64 (fastcall) | Fundamental architectural difference |
Key Technical Differences
1. Calling Convention Complexity
The x86 version benefits from the comfortable stdcall convention with MASM's invoke macro, which abstracts argument passing:
; x86: Simple and readable
invoke MessageBox, hWnd, addr szMessage, addr szTitle, MB_OK
The x64 version requires manual implementation of Microsoft's x64 calling convention (fastcall variant):
; x64: Manual register loading and stack management
mov rcx, hWnd ; 1st argument in RCX
lea rdx, szMessage ; 2nd argument in RDX
lea r8, szTitle ; 3rd argument in R8
mov r9d, MB_OK ; 4th argument in R9
sub rsp, 32 ; Shadow space (mandatory 32 bytes)
call MessageBox
add rsp, 32 ; Clean up shadow space
2. Shadow Space Requirement
x64 mandates 32 bytes (4 × 8-byte slots) of "shadow space" on the stack for every function call, even if the function takes fewer than 4 parameters. This is a strict ABI requirement for Windows x64 and must be maintained even when not passing arguments via stack.
3. Stack Alignment
x64 requires 16-byte stack alignment (RSP & 0xF == 0) before call instructions. Misalignment causes crashes in many API functions (particularly graphics-related). This requires explicit alignment:
and rsp, -16 ; Align to 16-byte boundary
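The shadow-space and alignment rules combine into one small calculation for the value used in `sub rsp, N`. The sketch below is written in C rather than MASM for readability; `frame_bytes` is a hypothetical helper, not part of the project. It assumes the procedure is entered with only the return address on the stack (so RSP is 8 bytes off a 16-byte boundary) and that its prologue pushes no registers.

```c
#include <stddef.h>

/* Hypothetical helper: bytes to subtract from RSP before calling a
 * function that takes `nargs` integer/pointer arguments.
 * Assumes RSP % 16 == 8 on entry (return address already pushed)
 * and that the prologue pushes no registers. */
size_t frame_bytes(size_t nargs)
{
    size_t stack_args = nargs > 4 ? nargs - 4 : 0; /* args 5+ go on the stack */
    size_t bytes = 32 + stack_args * 8;            /* shadow space + stack args */

    /* After `sub rsp, bytes`, RSP must be a multiple of 16. */
    if ((bytes + 8) % 16 != 0)
        bytes += 8;
    return bytes;
}
```

Under these assumptions, any call with up to four arguments needs 40 bytes (28h): 32 bytes of shadow space plus 8 bytes of padding.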
4. Register Usage
- x86: Arguments pushed on stack (right-to-left), return value in EAX
- x64: First 4 integer/pointer arguments in RCX, RDX, R8, R9; additional arguments on stack; return value in RAX
5. Pointer Size Impact
All pointers and handles are 64-bit (8 bytes) in x64, affecting:
- Structure sizes and alignment
- Memory access patterns
- Address arithmetic
Development Challenges: x86 → x64 Migration
The transition from x86 to x64 was a significant undertaking, primarily due to:
- Loss of High-Level Abstractions: The comfortable invoke macro in x86 (which auto-generates push sequences) is unavailable in x64. Every API call requires 5-7 lines of manual register/stack management.
- Shadow Space Management: Unlike x86's simple stack cleanup (add esp, N), x64's shadow space requirement adds cognitive overhead to every function call. Forgetting to allocate or deallocate shadow space leads to stack corruption.
- Stack Alignment Debugging: Crashes due to misaligned stacks are notoriously difficult to debug. A single misalignment early in the call chain can cause failures deep in GDI/Win32 APIs, far from the actual error.
- Increased Code Verbosity: Simple operations in x86 (1 line with invoke) expand to 6+ lines in x64, reducing code readability and increasing maintenance burden.
- No invoke Safety Net: The x86 invoke macro performs type checking and automatic stack cleanup. x64 requires manual verification of argument types, counts, and calling conventions for every API.
Despite these challenges, the x64 version maintains identical functionality and visual behavior, demonstrating that low-level assembly can achieve platform parity with careful attention to ABI details.
Technical Specifications & Features
1. Core Engine
- Zero-Dependency: No external libraries beyond standard Windows system DLLs (user32, gdi32, kernel32, advapi32, shell32).
- Memory Footprint: Highly optimized data structures. The entire game state is encapsulated in a single GAME_STATE structure.
- 7-Bag Randomizer: Implements the modern Tetris Guideline "Random Generator" (7-bag) algorithm using Fisher-Yates shuffle. This ensures a uniform distribution of pieces and prevents long droughts of specific shapes by shuffling a "bag" of all 7 tetrominoes.
- Fixed Timestep: Game logic is driven by a high-frequency loop tuned for 60 FPS (~16ms delta), ensuring smooth input response and movement.
- SRS-inspired Rotation: Super Rotation System with wall kick tables for both standard pieces and I-piece, allowing rotation near walls and floors.
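One common way to get a fixed ~16 ms logic step is a time accumulator, sketched here in C for readability. Whether the project's loop works exactly this way is an assumption; the `Timestep` type and `timestep_advance` are hypothetical names, and `now_ms` stands for a millisecond tick count such as the one the system provides.

```c
#include <stdint.h>

#define TICK_MS 16u  /* ~60 FPS logic step, per the description above */

typedef struct {
    uint32_t last_ms;     /* timestamp of the previous frame */
    uint32_t pending_ms;  /* elapsed time not yet consumed by logic ticks */
} Timestep;

/* Returns how many fixed logic ticks to run for the frame at `now_ms`. */
uint32_t timestep_advance(Timestep *ts, uint32_t now_ms)
{
    uint32_t elapsed = now_ms - ts->last_ms; /* unsigned math survives wraparound */
    ts->last_ms = now_ms;
    ts->pending_ms += elapsed;

    uint32_t ticks = ts->pending_ms / TICK_MS;
    ts->pending_ms %= TICK_MS;               /* carry the remainder forward */
    return ticks;
}
```

Carrying the remainder forward keeps the game speed steady even when frames arrive at irregular intervals.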
2. Graphics & Rendering
- GDI Double Buffering: Implementation of a backbuffer system using CreateCompatibleDC and CreateCompatibleBitmap to eliminate flickering during high-frequency screen invalidation.
- Ghost Piece Preview: Toggleable semi-transparent hatch pattern overlay showing the landing position of the current piece, rendered using CreateHatchBrush with HS_DIAGCROSS pattern.
- Animated UI Elements: Pulsing "PAUSED" text with sine-wave brightness modulation (127-255 range) at 60 FPS for smooth visual feedback.
- Vector-like Tetromino Definition: Shapes are defined as coordinate offsets in SHAPE_TEMPLATES, allowing for efficient rotation and collision calculations via iterative offset addition.
- Dynamic UI: Integration of standard Win32 controls (Edit boxes, Buttons) with custom GDI-rendered game area.
- Color-Coded Interface: Next piece preview and record holder name displayed in matching piece colors for visual consistency.
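The offset-based shape definition can be illustrated with a small placement check, written in C for readability. This is a sketch under assumptions: the `Offset` type, the `T_PIECE` coordinates, the flat row-major board layout, and `piece_fits` are all hypothetical and not taken from SHAPE_TEMPLATES.

```c
#include <stdbool.h>
#include <stdint.h>

#define BOARD_W 10
#define BOARD_H 20

typedef struct { int8_t dx, dy; } Offset;

/* Hypothetical template for a T piece: four cell offsets
 * relative to the piece origin. */
static const Offset T_PIECE[4] = { {0, 0}, {-1, 0}, {1, 0}, {0, -1} };

/* True when every cell of `shape` placed at (x, y) is inside the side
 * walls, above the floor, and not on an occupied board cell.
 * `board` is row-major, row 0 at the top; 0 means empty.
 * Cells above the top row (negative y) are allowed so pieces can spawn. */
bool piece_fits(const uint8_t *board, const Offset shape[4], int x, int y)
{
    for (int i = 0; i < 4; i++) {
        int cx = x + shape[i].dx;
        int cy = y + shape[i].dy;
        if (cx < 0 || cx >= BOARD_W || cy >= BOARD_H)
            return false;                         /* wall or floor */
        if (cy >= 0 && board[cy * BOARD_W + cx] != 0)
            return false;                         /* locked block */
    }
    return true;
}
```

Movement, rotation, and wall-kick tests can all reuse the same check by passing a candidate position or a rotated offset table.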
3. Data Persistence (Registry-based)
Unlike traditional implementations using .ini or .cfg files, this project utilizes the Windows Registry for state persistence:
- Path: HKEY_CURRENT_USER\Software\Tetris
- Stored Values:
  - PlayerName (REG_SZ / Unicode): Last active player identity.
  - HighScore (REG_DWORD): Maximum score achieved.
  - HighScoreName (REG_SZ / Unicode): Name of the record holder.
- Encoding: Full Unicode support for player names via RegQueryValueExW and RegSetValueExW.
- Clear Record Feature: One-click registry cleanup via Alt+C or dedicated button with confirmation dialog.
4. Collision & Logic
- AABB-style Collision: Piece-to-wall and piece-to-stack collision detection implemented through boundary checking and bitmask-like array lookups in the 10x20 board buffer.
- Line Clearing: Optimized scanline algorithm that identifies full rows and performs a memory-shift operation to drop the remaining blocks. Supports simultaneous multi-line clears.
- Progressive Difficulty: Gravity speed increases with level (every 10 lines cleared), calculated using fixed-point arithmetic with 1/10000 precision for smooth acceleration.
- Scoring System: Quadratic scaling (lines² × 100 × level) rewards multi-line clears and higher levels.
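The line-clearing and scoring rules above can be sketched as follows, in C for readability. The flat row-major board layout and the function names are assumptions made for illustration, and the score example assumes levels start at 1.

```c
#include <stdint.h>
#include <string.h>

#define BOARD_W 10
#define BOARD_H 20

/* Removes every full row, shifts the rows above it down, and returns
 * the number of rows cleared. `board` is row-major, row 0 at the top. */
int clear_lines(uint8_t *board)
{
    int cleared = 0;
    for (int y = BOARD_H - 1; y >= 0; ) {
        int full = 1;
        for (int x = 0; x < BOARD_W; x++)
            if (board[y * BOARD_W + x] == 0) { full = 0; break; }

        if (!full) { y--; continue; }

        /* Shift rows 0..y-1 down by one, then blank the top row. */
        memmove(board + BOARD_W, board, (size_t)y * BOARD_W);
        memset(board, 0, BOARD_W);
        cleared++;   /* re-test the same y: a new row has moved into it */
    }
    return cleared;
}

/* Score for one clear: lines^2 * 100 * level, as stated above. */
uint32_t clear_score(uint32_t lines, uint32_t level)
{
    return lines * lines * 100u * level;
}
```

With this formula a four-line clear at level 1 is worth 1600 points, sixteen times a single line.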
5. User Experience
- Keyboard Shortcuts: Full accelerator table support (P/Alt+P, P/Alt+R, Alt+C) for pause, resume, and clear operations.
- Customizable Icon: Dynamic icon loading from shell32.dll via ExtractIcon API (configurable index).
- Real-time Name Persistence: Player name auto-saves on text change via EN_CHANGE notification.
- Anonymous Fallback: Automatically assigns "Anonymous" to high scores when no player name is set.
Project Structure
The repository contains separate implementations for both architectures in dedicated directories:
Tetris_asm/
├── x86/                    # 32-bit implementation
│   ├── main.asm            # Entry point, WndProc, message loop
│   ├── game.asm            # Core game logic
│   ├── render.asm          # GDI rendering engine
│   ├── registry.asm        # Registry persistence layer
│   ├── data.inc            # Structures and constants
│   └── proto.inc           # Procedure prototypes
├── x64/                    # 64-bit implementation
│   ├── main.asm            # Entry point (manual x64 calling convention)
│   ├── game.asm            # Core logic (64-bit registers)
│   ├── render.asm          # GDI rendering (shadow space management)
│   ├── registry.asm        # Registry operations (64-bit pointers)
│   ├── data.inc            # Structures (8-byte alignment)
│   ├── proto.inc           # Procedure prototypes (fastcall)
│   ├── tetris.rc           # Resource script (icon, manifest)
│   └── tetris.manifest     # Application manifest (DPI, visual styles)
├── bin/                    # Output directory (created by build script)
│   ├── tetris.exe          # x86 binary (~13 KB)
│   └── tetris64.exe        # x64 binary (~18 KB)
└── build.ps1               # Unified PowerShell build script for both versions
Key Files
| File | Description |
|---|---|
| main.asm | Entry point, Window Procedure (WndProc), Message Loop, UI Control handling, and keyboard accelerators. |
| game.asm | Core logic: Tetromino movement, rotation with wall kicks, 7-bag generation using LCG RNG, collision detection, and line clearing. |
| render.asm | GDI rendering engine: Backbuffer management, block drawing, ghost piece rendering, pulsing text animation, and info panel output. |
| registry.asm | Low-level wrapper for advapi32 functions to handle persistent data storage (High Score, Player Name). |
| data.inc | Structure definitions (GAME_STATE, PIECE, RENDERER_STATE) and constant declarations. |
| proto.inc | Procedure prototypes for inter-modular communication. |
| tetris.rc | (x64 only) Resource script linking icon and application manifest. |
| tetris.manifest | (x64 only) Application manifest enabling DPI awareness, visual styles, and Windows 11 features. |
| build.ps1 | PowerShell script that builds both x86 and x64 versions using Visual Studio 2026 toolchain. |
Build Instructions
Prerequisites
- Visual Studio 2026 (or newer) with "Desktop development with C++" workload
  - Includes MASM (ml.exe for x86, ml64.exe for x64)
  - Includes Windows SDK with necessary libraries
Note for older Visual Studio versions (2022 and below): If you're using VS 2022 or earlier, you'll need to adjust the build.ps1 PowerShell script to point to the correct Visual Studio installation path and toolchain version (lines 7-14).
Building with PowerShell (Recommended)
A single PowerShell script builds both x86 and x64 versions simultaneously:
.\build.ps1
The script will:
- Build x86 version from x86/ directory → bin/tetris.exe (~13 KB)
- Build x64 version from x64/ directory → bin/tetris64.exe (~18 KB)
- Automatically clean up object files
Output:
bin/
├── tetris.exe (x86, ~13 KB)
└── tetris64.exe (x64, ~18 KB)
Manual Build Process
x86 (32-bit) Compilation
Open "x86 Native Tools Command Prompt for VS 2026", navigate to the x86/ directory, and run:
cd x86
:: Assemble all modules
ml /c /Cp /Cx /Zd /Zf /Zi main.asm
ml /c /Cp /Cx /Zd /Zf /Zi game.asm
ml /c /Cp /Cx /Zd /Zf /Zi render.asm
ml /c /Cp /Cx /Zd /Zf /Zi registry.asm
:: Link objects into a standalone GUI executable
link main.obj game.obj render.obj registry.obj /subsystem:windows /entry:start /out:tetris.exe
Output: tetris.exe (~13 KB)
x64 (64-bit) Compilation
Open "x64 Native Tools Command Prompt for VS 2026", navigate to the x64/ directory, and run:
cd x64
:: Assemble all modules (64-bit)
ml64 /c /Cp /Cx /Zd /Zf /Zi main.asm
ml64 /c /Cp /Cx /Zd /Zf /Zi game.asm
ml64 /c /Cp /Cx /Zd /Zf /Zi render.asm
ml64 /c /Cp /Cx /Zd /Zf /Zi registry.asm
:: Link objects into a standalone 64-bit GUI executable
link main.obj game.obj render.obj registry.obj /subsystem:windows /entry:start /out:tetris64.exe
Output: tetris64.exe (~18 KB)
Technical Note: The x64 version requires significantly more manual code for API calls due to the lack of invoke macro support and mandatory shadow space allocation (32 bytes per call).
Controls
| Key | Action |
|---|---|
| Left / Right | Move Tetromino horizontally |
| Up | Rotate clockwise (with wall kicks) |
| Down | Soft Drop (faster fall) |
| Space | Hard Drop (instant placement) |
| P | Pause / Resume / Restart (on Game Over) |
| F2 | Start New Game |
| ESC | Exit Application |
| Alt+P | Pause Game |
| Alt+R | Resume Game |
| Alt+C | Clear High Score Record |
UI Controls
- Player Name Field: Auto-saves on change, supports Unicode input (max 127 characters).
- Pause/Resume Button: Context-sensitive label (changes based on game state).
- Clear Record Button: Resets high score to 0 with confirmation dialog.
- Ghost Toggle Button: Enable/disable landing position preview (ON/OFF).
Customization
Changing Application Icon
For x86 version - Edit main.asm around line 83-84:
invoke ExtractIcon, g_hInstance, offset szShell32, 19 ; Change icon index here
For x64 version - Edit x64/main.asm (manual calling convention):
mov rcx, g_hInstance
lea rdx, szShell32
mov r8d, 19 ; Change icon index here
sub rsp, 32
call ExtractIcon
add rsp, 32
Recommended DLL files for icons (Windows 11):
- shell32.dll - Classic system icons
- imageres.dll - Modern icon collection (300+ icons)
- ddores.dll - Hardware/device icons
Use Resource Hacker to browse available icons and their indices in these files.
Technical Highlights
Random Number Generation
- Algorithm: Linear Congruential Generator (LCG)
- Formula: seed = seed × 1103515245 + 12345
- Seed Source: System tick count at initialization
- Distribution: Fisher-Yates shuffle of each 7-piece bag, so every tetromino appears exactly once per bag
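The generator and the bag shuffle fit together as in the following C sketch. The names `g_seed`, `lcg_next`, and `fill_bag` are hypothetical, and taking the upper 16 bits before the modulo is an assumption about how the raw LCG output is reduced; only the multiplier and increment come from the formula above.

```c
#include <stdint.h>

static uint32_t g_seed;  /* hypothetical name for the LCG state */

/* One LCG step with the constants quoted above; 32-bit overflow wraps. */
uint32_t lcg_next(void)
{
    g_seed = g_seed * 1103515245u + 12345u;
    return g_seed;
}

/* Fills `bag` with piece IDs 0..6 and shuffles them in place
 * (Fisher-Yates, walking from the last slot down). */
void fill_bag(uint8_t bag[7])
{
    for (int i = 0; i < 7; i++)
        bag[i] = (uint8_t)i;

    for (int i = 6; i > 0; i--) {
        /* Assumption: use the high bits, since the low bits of a
         * power-of-two-modulus LCG repeat with short periods. */
        uint32_t j = (lcg_next() >> 16) % (uint32_t)(i + 1);
        uint8_t tmp = bag[i];
        bag[i] = bag[j];
        bag[j] = tmp;
    }
}
```

A new bag is filled whenever the previous one is exhausted, so at most 12 other pieces can fall between two occurrences of the same shape.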
Gravity System
- Fixed-Point Math: Accumulator with 1/10000 precision (yFloat)
- Speed Formula: base_speed(300) + level × 50
- Drop Trigger: Piece moves down when accumulator reaches 10000
Rendering Pipeline
- Clear backbuffer (dark gray 0x141414)
- Draw grid lines (0x323232)
- Draw locked blocks from board array
- Draw line clear animation overlay (x64: gold-to-black fade, 300ms)
- Draw ghost piece (hatch pattern, conditional)
- Draw current falling piece
- Draw next piece preview (color-matched)
- Draw statistics and controls guide
- Draw overlays (PAUSED pulsing text / GAME OVER)
- BitBlt backbuffer to screen (single operation, no flicker)
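The gold-to-black overlay color in the x64 line clear step can be computed per frame as in this C sketch. Linear interpolation is an assumption (the actual easing curve is not documented here), and `line_clear_color` is a hypothetical name; the result uses the 0x00BBGGRR layout of a GDI COLORREF.

```c
#include <stdint.h>

#define FADE_MS 300u

/* Returns the fade color as 0x00BBGGRR for `elapsed_ms` into the
 * animation: gold (255,215,0) at 0 ms, black at FADE_MS and beyond. */
uint32_t line_clear_color(uint32_t elapsed_ms)
{
    if (elapsed_ms >= FADE_MS)
        return 0;

    uint32_t remaining = FADE_MS - elapsed_ms;
    uint32_t r = 255u * remaining / FADE_MS;
    uint32_t g = 215u * remaining / FADE_MS;
    uint32_t b = 0;
    return r | (g << 8) | (b << 16);
}
```

Integer scaling keeps the calculation free of floating-point code, which suits a CRT-free assembly port.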
Which Version Should You Use?
- x86 (32-bit): Compatible with both 32-bit and 64-bit Windows systems. Smaller binary size (~13 KB). Classic Windows styling. Recommended for maximum compatibility.
- x64 (64-bit): Native 64-bit application with modern Windows 11 visual enhancements (Mica backdrop, line clear animations, green player field). Larger binary (~18 KB) but demonstrates advanced assembly techniques including DWM API integration.
Both versions provide the same core gameplay experience. The x64 version offers enhanced visuals on Windows 11.
Author: Marek Wesołowski
Email: [email protected]
Website: https://kvc.pl
License: MIT