CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
w3m is a text-based web browser and pager for terminal environments. It's a C codebase originally developed by Akinori Ito and currently maintained as a Debian fork. The browser can display HTML documents in text mode, follow links, handle forms, and display images using external viewers.
Build System
This project uses autotools (autoconf/automake) with a traditional Makefile-based build system:
# Configure the build (generates Makefile from Makefile.in)
./configure
# Build the project
make
# Install (requires root privileges)
make install
# Clean build artifacts
make clean
Dependencies
- GC library (version 6.1 or later): Boehm garbage collector for memory management
- Standard C development tools: gcc/clang, make, autoconf
- Optional: Various image libraries for image display support
Testing
Basic regression tests are available:
# Run the test suite
cd tests
./run_tests
The test suite compares w3m HTML rendering output against expected results for various HTML test cases.
Code Architecture
Core Components
- main.c: Entry point and main event loop
- fm.h: Central header file containing core data structures and definitions
- buffer.c: Buffer management for document content and display
- display.c: Terminal display and rendering logic with color support
- html.c: HTML parsing and tag processing
- file.c: File and URL handling, protocol support (HTTP, FTP, etc.)
- form.c: HTML form processing and interaction
- table.c: HTML table rendering and layout
- frame.c: HTML frame support
Key Data Structures
- Buffer: Central data structure for document content, defined in fm.h
- TabBuffer: Tab management for multiple documents
- BufferPos: Position tracking within documents
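To make the relationship between these structures concrete, here is a minimal sketch of a tab that holds a stack of buffers, each with a cursor position. The names `struct buf`, `struct tab`, `struct pos`, `tab_push`, and `tab_count` are hypothetical stand-ins invented for illustration; the real definitions in fm.h carry many more fields and may be organized differently.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-ins for BufferPos, Buffer, and TabBuffer.
 * These are NOT the definitions from fm.h. */
struct pos {
    long line;
    int col;
};

struct buf {
    const char *title;
    struct pos cursor;   /* where the user is in this document */
    struct buf *next;    /* older buffer in the same tab */
};

struct tab {
    struct buf *first;   /* newest buffer in this tab */
    struct buf *current; /* buffer currently displayed */
    struct tab *next, *prev;
};

/* Push a buffer on top of a tab's stack and make it current. */
void tab_push(struct tab *t, struct buf *b)
{
    b->next = t->first;
    t->first = b;
    t->current = b;
}

/* Count the buffers held by a tab. */
int tab_count(const struct tab *t)
{
    int n = 0;
    const struct buf *b;

    for (b = t->first; b != NULL; b = b->next)
        n++;
    return n;
}
```

Check fm.h for the actual field names before relying on any of this layout.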
Character Encoding Support
The libwc/ directory contains comprehensive character encoding support:
- Multi-byte character handling (UTF-8, EUC-JP, Big5, etc.)
- Character set detection and conversion
- Wide character support for international text
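As a rough illustration of what multi-byte character handling involves, the sketch below decodes a single UTF-8 sequence into a code point. It is standalone and does not use the libwc API; the function name `utf8_decode` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libwc: decode one UTF-8 sequence.
 * Returns the number of bytes consumed (1 to 4), or 0 if the input is
 * invalid or truncated. On success the code point is stored in *out. */
size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *out)
{
    size_t need, i;
    uint32_t cp, min;

    if (len == 0)
        return 0;
    if (s[0] < 0x80) {                    /* single-byte ASCII */
        *out = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {   /* 2-byte lead */
        need = 2; cp = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {   /* 3-byte lead */
        need = 3; cp = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {   /* 4-byte lead */
        need = 4; cp = s[0] & 0x07; min = 0x10000;
    } else {
        return 0;                         /* stray continuation or bad lead */
    }
    if (len < need)
        return 0;                         /* truncated sequence */
    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                     /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, UTF-16 surrogates, and values past U+10FFFF. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    *out = cp;
    return need;
}
```

A caller would loop over a byte buffer, advancing by the returned length and treating a return of 0 as a decoding error.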
Image Support
The w3mimg/ directory provides image display capabilities:
- fb/: Framebuffer image display
- x11/: X11 image display
- win/: Windows image display
Internationalization
- po/: Translation files for multiple languages (German, Japanese, Chinese, etc.)
- Multi-language documentation in doc/, doc-jp/, doc-de/
Configuration
- Configuration is handled through autoconf-generated config.h
- Runtime configuration via rc.c and various RC files
- Menu and keymap configurations in doc/ directories
Development Notes
- The codebase uses the Boehm GC for memory management
- Heavy use of custom string handling via Str.c/Str.h
- Terminal capabilities handled through terms.c
- Mouse support available through GPM and other terminal mouse protocols
Screen Reader Navigation Features
Screen reader-style navigation commands have been successfully implemented to improve accessibility:
New Navigation Commands:
- d - Move to next heading (NEXT_HEADING)
- e - Move to previous heading (PREV_HEADING)
- f - Move to next form element (NEXT_FORM)
- p - Move to previous form element (PREV_FORM)
Implementation Details:
- Heading navigation: Uses heuristic text analysis to identify headings while filtering out paragraphs, links, and other non-heading content
- Form navigation: Leverages existing formitem anchor system for reliable form element traversal
- Functions: Implemented in main.c as _nextHeading(), _prevHeading(), _nextForm(), _prevForm()
- Key bindings: Integrated into hardcoded keymap array in keybind.c for reliable key processing
- Status: ✅ WORKING - Heading navigation fully functional, form navigation ready for testing
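The notes above do not spell out the heading heuristic, so the following is only a sketch of what such a heuristic could look like, not the logic in main.c. It assumes a line of rendered text "looks like" a heading when it is short, non-blank, follows a blank line, and does not end in sentence punctuation. The names `looks_like_heading`, `next_heading`, and `prev_heading`, and the 60-character threshold, are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* True if the line contains only spaces and tabs. */
static int is_blank(const char *s)
{
    return s[strspn(s, " \t")] == '\0';
}

/* Hypothetical heuristic, NOT the one used by _nextHeading(): a line is
 * heading-like if it is short, non-blank, follows a blank line (or starts
 * the document), and does not end in sentence punctuation. */
int looks_like_heading(const char *const *lines, int n, int i)
{
    size_t len;
    char last;

    if (i < 0 || i >= n || is_blank(lines[i]))
        return 0;
    if (i > 0 && !is_blank(lines[i - 1]))
        return 0;
    len = strlen(lines[i]);
    if (len > 60)
        return 0;
    last = lines[i][len - 1];
    return !(last == '.' || last == ',' || last == ';' || last == ':');
}

/* Return the index of the first heading-like line after cur, or -1. */
int next_heading(const char *const *lines, int n, int cur)
{
    int i;

    for (i = cur + 1; i < n; i++)
        if (looks_like_heading(lines, n, i))
            return i;
    return -1;
}

/* Return the index of the last heading-like line before cur, or -1. */
int prev_heading(const char *const *lines, int n, int cur)
{
    int i;

    for (i = (cur < n ? cur : n) - 1; i >= 0; i--)
        if (looks_like_heading(lines, n, i))
            return i;
    return -1;
}
```

A text-only heuristic like this will misclassify some short paragraphs and list items, which is why any real implementation needs extra filtering of the kind described above.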
Build Notes
Modernization Status
The codebase has been partially updated to compile with modern GCC versions:
✅ FIXED:
- Signal handler type compatibility issues in main.c, terms.c, istream.c
- Function pointer type issues in parsetagx.c (function dispatch table)
- Input keymap function call issues in linein.c
- GPM mouse library compatibility (Gpm_Wgetch vs Gpm_Getch)
⚠️ REMAINING ISSUES:
- Function pointer compatibility in libwc/ (character encoding library)
- Various other function pointer signature mismatches throughout codebase
BUILD COMMAND:
make WARNINGS="-Wall -Wnull-dereference -Wno-incompatible-pointer-types -Wno-pointer-sign"
The core w3m functionality including the new screen reader navigation compiles successfully. The remaining issues are in the character encoding subsystem and would require systematic function pointer signature updates throughout libwc/.
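For reference, the kind of signature fix described above usually means giving every entry in a dispatch table one explicit prototype, and declaring signal handlers as `void (*)(int)`. The sketch below shows that pattern in isolation; `tag_handler`, `tag_table`, `dispatch_tag`, and `install_and_raise` are hypothetical names and are not taken from parsetagx.c or libwc/.

```c
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* One explicit prototype shared by every table entry. An old-style
 * `int (*fn)()` field hides mismatched parameter lists, which stricter
 * compilers diagnose as incompatible pointer types. */
typedef int (*tag_handler)(const char *arg);

static int handle_bold(const char *arg) { return (int)strlen(arg); }
static int handle_break(const char *arg) { (void)arg; return 0; }

struct tag_entry {
    const char *name;
    tag_handler fn;
};

static const struct tag_entry tag_table[] = {
    { "b",  handle_bold },
    { "br", handle_break },
};

/* Look up name and call its handler; returns -1 for unknown names. */
int dispatch_tag(const char *name, const char *arg)
{
    size_t i;

    for (i = 0; i < sizeof(tag_table) / sizeof(tag_table[0]); i++)
        if (strcmp(tag_table[i].name, name) == 0)
            return tag_table[i].fn(arg);
    return -1;
}

/* Signal handlers must match void (*)(int) exactly. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    got_signal = signo;
}

/* Install the handler, raise SIGINT, and report which signal was seen. */
int install_and_raise(void)
{
    signal(SIGINT, on_signal);
    raise(SIGINT);
    return (int)got_signal;
}
```

With the prototypes made explicit, the `-Wno-incompatible-pointer-types` flag should no longer be needed for the converted files, so a mismatch that remains will show up as a diagnostic instead of being silenced.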
Security Considerations
Recent security fixes have addressed buffer overflow vulnerabilities (CVE-2023-38252, CVE-2023-38253). When modifying string handling or buffer operations, pay careful attention to bounds checking.
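As an illustration of the bounds checking meant here, the sketch below appends to a growable string and validates the size arithmetic before allocating or copying. It is a hypothetical stand-in loosely in the spirit of Str: `struct gstr` and `gstr_append` are invented names, and it uses plain `realloc` where w3m would use the Boehm GC.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string, NOT the Str type from Str.h. */
struct gstr {
    char *ptr;
    size_t length;   /* bytes used, excluding the trailing NUL */
    size_t capacity; /* bytes allocated, including room for the NUL */
};

/* Append n bytes from src. Returns 0 on success, or -1 if the size
 * computation would overflow or the allocation fails. */
int gstr_append(struct gstr *s, const char *src, size_t n)
{
    size_t needed, newcap;
    char *p;

    /* Validate the arithmetic first: length + n + 1 must not wrap. */
    if (s->length >= SIZE_MAX || n > SIZE_MAX - s->length - 1)
        return -1;
    needed = s->length + n + 1;

    if (needed > s->capacity) {
        newcap = s->capacity ? s->capacity : 16;
        while (newcap < needed) {
            if (newcap > SIZE_MAX / 2) {  /* doubling would wrap */
                newcap = needed;
                break;
            }
            newcap *= 2;
        }
        p = realloc(s->ptr, newcap);
        if (p == NULL)
            return -1;
        s->ptr = p;
        s->capacity = newcap;
    }
    if (n > 0)
        memcpy(s->ptr + s->length, src, n);
    s->length += n;
    s->ptr[s->length] = '\0';
    return 0;
}
```

The key point is the order of operations: the length check happens before any addition, allocation, or copy, so an oversized request fails cleanly instead of writing past the buffer.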