Notepad: Win32 API Implementation (x86/x64)

Abstract
Notepad is a minimal, runtime-free implementation of a text editor for the Microsoft Windows operating system, written entirely in Microsoft Macro Assembler (MASM). Unlike standard software development involving high-level abstractions (C#, C++, Python), this project calls the Win32 API directly and manages CPU registers by hand, bypassing the C Runtime (CRT) entirely.
This repository serves as a reference implementation for systems programmers, malware analysts, and computer science students studying the PE (Portable Executable) format, Windows message loops, and low-level memory management. It demonstrates the dichotomy between legacy x86 (Flat Memory Model) and modern x64 (Microsoft x64 ABI) calling conventions within a single codebase.
Download
Pre-built binaries: notepad.zip
Contains both x86 and x64 executables ready to run on Windows.
Technical Specifications
Build Environment
| Component | Specification | Notes |
|---|---|---|
| Assembler | ml.exe (x86) / ml64.exe (x64) | Microsoft Macro Assembler |
| Linker | link.exe | Microsoft Incremental Linker |
| Subsystem | WINDOWS | Graphical User Interface (GUI) |
| Entry Point | start | Custom entry, no main() wrapper |
| Resource Compiler | rc.exe | Compiles menus, icons, and manifests |
Core Dependencies (System DLLs)
The application relies strictly on standard dynamic link libraries found in all Windows versions since XP:
- kernel32.dll: Memory allocation (HeapAlloc/HeapFree), File I/O (CreateFile, ReadFile, WriteFile), Process control
- user32.dll: Window creation (CreateWindowEx), Message loop (GetMessage), Clipboard interaction
- gdi32.dll: Font rendering and graphics device interface contexts
- comdlg32.dll: Common Dialogs (Open File, Save File, Print, Find/Replace)
- shell32.dll: Shell functions and file path operations
- shlwapi.dll: Shell Lightweight API (PathFindFileName for title display)
- comctl32.dll: Common controls (Status Bar)
- riched20.dll: RichEdit 2.0 control for advanced text editing
Architecture & Internals
The application implements a standard Windows Event-Driven Architecture. It does not poll for input; rather, it yields CPU time until the Operating System pushes a message to the thread's message queue.
1. The Message Loop (The Heartbeat)
The entry point initializes the WNDCLASSEX structure and creates the main window. It then enters the message loop, which blocks inside GetMessage and therefore consumes approximately 0% CPU when idle. The loop exits when WM_QUIT is received.
; Pseudo-assembly representation of the core loop (x64)
; Assumes RSP is 16-byte aligned and 32 bytes of shadow space are already reserved
MessageLoop:
lea rcx, msg ; lpMsg
xor edx, edx ; hWnd = NULL (any window of this thread)
xor r8d, r8d ; wMsgFilterMin = 0
xor r9d, r9d ; wMsgFilterMax = 0
call GetMessage ; Blocking call, waits for OS event
test eax, eax
jz ExitProgram ; WM_QUIT received
; Modeless Dialog Handling (Find/Replace)
mov rcx, hFindReplaceDlg
lea rdx, msg
call IsDialogMessage ; Checks if msg belongs to Find/Replace dialog
test eax, eax
jnz MessageLoop ; If handled, skip Dispatch
lea rcx, msg ; RCX is volatile, reload before each call
call TranslateMessage ; Virtual-Key -> character
lea rcx, msg
call DispatchMessage ; Route to WndProc
jmp MessageLoop
2. Dual-Architecture Logic (x86 vs x64)
The codebase highlights critical differences in assembly programming between 32-bit and 64-bit modes.
x86 (32-bit Protected Mode)
- Calling Convention: STDCALL. Arguments are pushed onto the stack in reverse order (right to left). The callee cleans the stack (ret n).
- Registers: EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP.
- Memory Addressing: 32-bit absolute or relative.
- MASM Syntax: Uses the invoke directive for simplified API calls.
x64 (Long Mode)
- Calling Convention: Microsoft x64 ABI (FASTCALL variant).
  - First 4 integer arguments passed in RCX, RDX, R8, R9.
  - Floating point args in XMM0 - XMM3.
  - Remaining arguments are passed on the stack, above the shadow space.
- Shadow Space: The caller must reserve 32 bytes (0x20) on the stack for the callee to spill registers.
- Stack Alignment: The stack pointer (RSP) must be aligned to a 16-byte boundary before calling any Windows API function.
- RIP-Relative Addressing: Data is accessed relative to the current instruction pointer to support position-independent code (PIC).
- Handles: All handles and pointers are 64-bit (QWORD).
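The shadow-space and alignment rules above reduce to a small piece of arithmetic that decides the operand of `sub rsp, N` in a prologue. The following is a minimal sketch of that calculation in Python, not code from this repository; the function name `frame_size` and its parameters are hypothetical and exist only for illustration.

```python
SHADOW_SPACE = 32  # bytes the caller reserves so the callee can spill RCX, RDX, R8, R9
SLOT = 8           # every stack argument and every pushed register occupies 8 bytes


def frame_size(max_call_args: int, local_bytes: int = 0, pushed_regs: int = 0) -> int:
    """Return N for `sub rsp, N` so that RSP is 16-byte aligned at each call.

    max_call_args: largest argument count of any function called from this one
    local_bytes:   bytes of local variables kept on the stack (illustrative)
    pushed_regs:   non-volatile registers pushed before `sub rsp, N`
    """
    stack_args = max(0, max_call_args - 4) * SLOT
    needed = SHADOW_SPACE + stack_args + local_bytes
    # On entry RSP is 8 bytes off a 16-byte boundary because CALL pushed the
    # return address; each PUSH in the prologue moves it by another 8 bytes.
    already_on_stack = SLOT + pushed_regs * SLOT
    padding = -(already_on_stack + needed) % 16
    return needed + padding


if __name__ == "__main__":
    print(hex(frame_size(4)))   # a call with up to four arguments
    print(hex(frame_size(12)))  # a twelve-argument call such as CreateWindowExW
```

For a function that only calls four-argument APIs and pushes no registers, the sketch yields 40 bytes (28h): 32 bytes of shadow space plus 8 bytes of padding to undo the misalignment introduced by the return address.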
3. Memory Management Implementation
Since malloc and free (C-Runtime) are unavailable, the application interfaces directly with the Windows Heap Manager via kernel32:
- Allocation: HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, size)
- Deallocation: HeapFree(hHeap, 0, pMemory)
Heap allocations are used for:
- File buffers when reading/writing files
- Text buffers for status bar updates and word counting
- Temporary storage during word wrap toggle
4. Unicode Support
The application uses Unicode (UTF-16 LE) throughout:
- All Windows API calls use Wide (W) variants: CreateWindowExW, SendMessageW, etc.
- File Reading: Detects encoding via BOM (Byte Order Mark):
  - UTF-16 LE (FF FE): Direct load
  - UTF-8 (EF BB BF): Convert via MultiByteToWideChar
  - No BOM: Try UTF-8 first, fallback to ANSI (CP_ACP)
- File Writing: Always UTF-16 LE with BOM for maximum compatibility
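The detection order described above can be sketched in a few lines of Python. This is an illustration of the decision logic only, not the project's assembly: the function names are hypothetical, and `cp1252` is an assumed stand-in for CP_ACP, which on a real system depends on the user's locale.

```python
UTF16_LE_BOM = b"\xff\xfe"
UTF8_BOM = b"\xef\xbb\xbf"


def decode_file_bytes(data: bytes, ansi_codec: str = "cp1252") -> str:
    """Decode raw file bytes using the BOM-first strategy described above."""
    if data.startswith(UTF16_LE_BOM):
        # UTF-16 LE: already the in-memory format, only the BOM is skipped.
        return data[len(UTF16_LE_BOM):].decode("utf-16-le")
    if data.startswith(UTF8_BOM):
        # UTF-8 with BOM: convert (MultiByteToWideChar in the real program).
        return data[len(UTF8_BOM):].decode("utf-8")
    try:
        # No BOM: try strict UTF-8 first.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: fall back to the ANSI code page (assumed cp1252 here).
        return data.decode(ansi_codec, errors="replace")


def encode_for_save(text: str) -> bytes:
    """Files are always written as UTF-16 LE with a leading BOM."""
    return UTF16_LE_BOM + text.encode("utf-16-le")


if __name__ == "__main__":
    print(decode_file_bytes(encode_for_save("hello")))
```

Trying strict UTF-8 before the ANSI fallback works because byte sequences produced by legacy code pages are rarely valid UTF-8, so a successful strict decode is a strong signal.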
Feature Implementation Detail
A. The RichEdit Control
Instead of using a basic EDIT control, the application uses RichEdit 2.0 (riched20.dll) which provides:
- Advanced text selection and manipulation
- Built-in Find/Replace support via EM_FINDTEXTEX
- Character formatting capabilities
- Better undo/redo handling
Styles: WS_CHILD | WS_VISIBLE | WS_VSCROLL | ES_MULTILINE | ES_AUTOVSCROLL | ES_NOHIDESEL
When word wrap is turned off, WS_HSCROLL | ES_AUTOHSCROLL are added to enable horizontal scrolling; they are removed again when word wrap is turned on.
B. File I/O Pipeline
File operations follow a fixed sequence of steps:
- CreateFile: Opens handle with GENERIC_READ or GENERIC_WRITE
- GetFileSize: Determines allocation requirements
- Heap Allocation: Dynamic memory request via HeapAlloc
- ReadFile / WriteFile: Bulk transfer between disk and memory
- Encoding Conversion: BOM detection and MultiByteToWideChar if needed
- SetWindowText / GetWindowText: Transfer between memory and the GUI RichEdit control
C. Find & Replace
The search feature uses the Common Dialog Box Library (FindText / ReplaceText) for the UI, with search logic implemented via RichEdit messages:
- Search: EM_FINDTEXTEX with FINDTEXTEX structure
- Selection: EM_EXSETSEL to highlight matching text
- Replace: EM_REPLACESEL for text substitution
- Wrap Around: Automatic search restart from beginning/end when not found
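The wrap-around behaviour can be illustrated independently of the RichEdit messages. The sketch below is a hypothetical Python model, not the repository's code: `str.find` and `str.rfind` stand in for EM_FINDTEXTEX, and the returned pair corresponds to the range that would be passed to EM_EXSETSEL.

```python
from typing import Optional, Tuple


def find_with_wrap(text: str, needle: str, caret: int,
                   forward: bool = True) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the next match, wrapping once if needed."""
    if not needle:
        return None
    if forward:
        pos = text.find(needle, caret)       # search from the caret to the end
        if pos == -1:
            pos = text.find(needle)          # wrap: restart from the beginning
    else:
        pos = text.rfind(needle, 0, caret)   # search from the start up to the caret
        if pos == -1:
            pos = text.rfind(needle)         # wrap: restart from the end
    if pos == -1:
        return None                          # no match anywhere in the document
    return pos, pos + len(needle)


if __name__ == "__main__":
    print(find_with_wrap("one two one", "one", 9))
```

Restarting the search over the whole document keeps the logic simple; the second search can only find a match on the far side of the caret, because a match on the near side would already have been found by the first search.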
D. Status Bar
Real-time display of:
- Current cursor position: Ln X, Col Y
- Word count (manual counting algorithm)
- Character count (excluding CR)
- Line count
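One plausible shape for these counters is a single pass over the text with a small in-word/out-of-word state machine. The Python sketch below is an assumption about the approach, not a transcription of the assembly: it treats space, tab, CR and LF as word separators, assumes lines are terminated by CR LF or LF, and uses hypothetical function names.

```python
WHITESPACE = " \t\r\n"


def document_stats(text: str) -> dict:
    """Count words, characters (excluding CR) and lines in one document."""
    words = 0
    in_word = False
    for ch in text:
        if ch in WHITESPACE:
            in_word = False
        elif not in_word:
            in_word = True       # transition from separator to word: new word
            words += 1
    chars = sum(1 for ch in text if ch != "\r")
    lines = text.count("\n") + 1  # an empty document still has one line
    return {"words": words, "chars": chars, "lines": lines}


def line_and_column(text: str, caret: int) -> tuple:
    """Return the 1-based (line, column) for a caret offset into text."""
    before = text[:caret]
    line = before.count("\n") + 1
    column = caret - (before.rfind("\n") + 1) + 1
    return line, column


if __name__ == "__main__":
    print(document_stats("hello world\r\nfoo"))
    print(line_and_column("ab\r\ncd", 5))
```

Excluding CR from the character count means a CR LF line break contributes one character rather than two, which matches what users typically expect from a status bar.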
Keyboard Shortcuts
| Shortcut | Action |
|---|---|
| Ctrl+N | New document |
| Ctrl+O | Open file |
| Ctrl+S | Save file |
| Ctrl+Shift+S | Save As |
| Ctrl+P | Print |
| Ctrl+Z | Undo |
| Ctrl+X | Cut |
| Ctrl+C | Copy |
| Ctrl+V | Paste |
| Ctrl+A | Select All |
| Ctrl+F | Find |
| Ctrl+H | Replace |
| F3 | Find Next |
| Shift+F3 | Find Previous |
| Del | Delete selection |
Performance & Metrics
| Metric | Notepad ASM (x64) | Notepad ASM (x86) | MS Notepad (Win11) | VS Code |
|---|---|---|---|---|
| Disk Usage | ~20 KB | ~18 KB | ~200 KB + Deps | ~300 MB |
| RAM Usage (Idle) | ~1.5 MB | ~1.2 MB | ~12 MB | ~400 MB |
| Startup Time | < 10ms | < 10ms | ~200ms | ~2500ms |
| Dependencies | System DLLs only | System DLLs only | UWP / CRT | Electron / Node.js |
Note: The tiny memory footprint is due to the lack of garbage collection, JIT compilation, or interpreted runtime environments. Apart from the system DLLs, the process contains only the executable's own image and its heap allocations.
Build Instructions
The project includes a PowerShell build script (build.ps1) that automates the assembly and linking process.
Prerequisites
- Visual Studio Build Tools (Workload: C++ Desktop Development)
- Windows SDK (for rc.exe and libraries)
- PATH must include paths to ml.exe, ml64.exe, rc.exe, and link.exe
Compilation Steps
- Clone the repository:
  git clone https://github.com/wesmar/notepad.git
  cd notepad
- Run the Build Script:
  .\build.ps1
  The script will:
  - Compile resources (.rc -> .res)
  - Assemble source files (.asm -> .obj)
  - Link object files with libraries into executables
  - Move binaries to bin/ folder
  - Clean up intermediate files
- Manual Compilation (x64 Example):
  cd x64
  rc /c65001 notepad.rc
  ml64 /c /Cp /Cx /Zd /Zf /Zi main.asm
  ml64 /c /Cp /Cx /Zd /Zf /Zi file.asm
  ml64 /c /Cp /Cx /Zd /Zf /Zi edit.asm
  link main.obj file.obj edit.obj notepad.res /subsystem:windows /entry:start /out:Notepad_x64.exe /MANIFEST:EMBED /MANIFESTINPUT:notepad.manifest kernel32.lib user32.lib gdi32.lib comdlg32.lib shell32.lib shlwapi.lib comctl32.lib
Scientific & Academic Use Cases
This project is not merely a tool, but a pedagogical instrument for:
- Reverse Engineering Training:
  - Analyzing the generated binary in IDA Pro or Ghidra provides a clean "control group" for recognizing standard Win32 patterns without compiler optimization noise
  - Perfect for learning to identify prologue and epilogue sequences manually
- Malware Analysis Research:
  - Many malware families use raw API calls to avoid detection by heuristics that look for CRT signatures
  - Understanding how to invoke APIs like CreateFile and HeapAlloc in pure assembly is crucial for analysts
- Operating Systems Study:
  - Demonstrates the boundary between User Mode (Ring 3) application logic and Kernel Mode (Ring 0) transitions via system calls (mediated by ntdll.dll / kernel32.dll)
Directory Structure
notepad/
├── bin/ # Compiled executables
│ ├── Notepad_x86.exe # 32-bit executable (~18 KB)
│ └── Notepad_x64.exe # 64-bit executable (~20 KB)
├── x86/ # 32-bit source files
│ ├── main.asm # Entry point, WinMain, WndProc
│ ├── file.asm # File operations (New, Open, Save, Print)
│ ├── edit.asm # Edit functions (Find, Replace, Status Bar)
│ ├── data.inc # Data structures, constants, variables
│ ├── proto.inc # Function prototypes, API declarations
│ ├── notepad.rc # Resource script (manifest reference)
│ └── notepad.manifest # Application manifest (DPI awareness, etc.)
├── x64/ # 64-bit source files
│ ├── main.asm # Entry point, WinMain, WndProc (x64 ABI)
│ ├── file.asm # File operations (x64 calling convention)
│ ├── edit.asm # Edit functions (x64 calling convention)
│ ├── data.inc # Data structures (64-bit handles, alignment)
│ ├── proto.inc # Function prototypes (EXTERN declarations)
│ ├── notepad.rc # Resource script
│ └── notepad.manifest # Application manifest
├── build.ps1 # Automated build pipeline
├── LICENSE.md # MIT License
└── README.md # Documentation
Known Limitations
- Large File Handling: The implementation loads the entire file into RAM. Files larger than available heap space will trigger an allocation failure.
- Undo/Redo: Relies on the RichEdit control's built-in undo buffer. Complex multi-level undo history is not manually implemented.
- Print: Basic single-page print implementation. Does not support pagination or print preview.
License
MIT License. Free for academic, personal, and commercial use. Attribution to the original author is appreciated but not mandatory.
Author
Marek Wesolowski
- Email: [email protected]
- Website: https://kvc.pl
- Tel/WhatsApp: +48 607 440 283
Project Repository: https://github.com/wesmar/notepad