This commit completely overhauls the heading navigation feature (d/shift+D keys)
to use actual HTML tag information instead of unreliable text-based heuristics.
Key improvements:
- Navigation now follows the actual <h1>-<h6> HTML tags instead of guessing from text
- Eliminates false positives from bold text, links, and buttons
- No longer navigates to blank lines around headings
- Provides true screen reader-style heading navigation
Technical implementation:
- Added LINE_FLAG_HEADING flag to mark heading lines during HTML processing
- Enhanced readbuffer with in_heading field to track heading tag state
- Modified HTML parser to set/clear heading flags on opening/closing <h1>-<h6> tags
- Updated TextLine and Line structures to preserve heading information
- Simplified navigation functions to use reliable flag-based detection
- Added content length check to avoid marking blank spacing lines
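The flag-based approach listed above can be sketched roughly as follows. This is a minimal illustration, not the actual w3m code: `struct line`, `mark_heading`, `next_heading`, `prev_heading` and the value of `LINE_FLAG_HEADING` are all assumptions made for the example; only the flag name itself comes from this commit.

```c
#include <stddef.h>

/* Hypothetical flag value; the commit names LINE_FLAG_HEADING but not its bit. */
#define LINE_FLAG_HEADING 0x01

/* Hypothetical, simplified stand-in for a rendered line in a linked buffer. */
struct line {
    struct line *next;
    struct line *prev;
    unsigned int flags;  /* carries LINE_FLAG_HEADING for heading lines */
    int len;             /* content length; blank spacing lines have len == 0 */
};

/* Mark a line as a heading only if it has content (the content length check). */
void mark_heading(struct line *l, int in_heading)
{
    if (in_heading && l->len > 0)
        l->flags |= LINE_FLAG_HEADING;
}

/* Return the first heading line after cur, or NULL if there is none. */
struct line *next_heading(struct line *cur)
{
    struct line *l;

    for (l = cur ? cur->next : NULL; l != NULL; l = l->next)
        if (l->flags & LINE_FLAG_HEADING)
            return l;
    return NULL;
}

/* Return the first heading line before cur, or NULL if there is none. */
struct line *prev_heading(struct line *cur)
{
    struct line *l;

    for (l = cur ? cur->prev : NULL; l != NULL; l = l->prev)
        if (l->flags & LINE_FLAG_HEADING)
            return l;
    return NULL;
}
```

Because the search only tests a flag, bold text, links and buttons cannot match. A heading that wraps over several lines would flag each of those lines, so a real implementation may want to skip the remainder of the current heading first.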
Also includes compilation fixes for modern GCC:
- Fixed function pointer type compatibility issues
- Updated signal handler declarations
- Resolved deprecation warnings for various system calls
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
This fixes issue #199 reported by Kuang-che Wu.
A specially crafted Gopher URL (e.g. '<a href=gopher:R>') could lead to
an out-of-bounds read.
The problem was that 'p' was incremented twice without checking for
the end of the string.
The interesting question for me is: What does this 'if' actually check?
What is special here about the 'R'? I did not find anything related in
RFC 1436 or in RFC 4266.
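For illustration, the class of fix described above (never stepping a pointer past the terminating NUL) can be sketched as below. `advance_checked` is a hypothetical helper written for this note and is not the code changed by the patch; with a string such as "R", a second unchecked increment would presumably land beyond the terminator.

```c
#include <stddef.h>

/*
 * Hypothetical helper: advance p by up to n characters, stopping at the
 * terminating NUL so the pointer never leaves the string.
 */
const char *advance_checked(const char *p, size_t n)
{
    while (n > 0 && *p != '\0') {
        p++;
        n--;
    }
    return p;
}
```

Called as `advance_checked(p, 2)` on "R", the result points at the terminator instead of one byte past it.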
Since Google gives usable search results to Lynx but not to w3m, and
many other sites block Lynx but /not/ w3m, we want to be able to set
the User Agent string on a per-site basis.
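A per-site lookup of this kind could look roughly like the sketch below. It is only an assumption about the shape of such a feature: `struct ua_rule` and `user_agent_for` are hypothetical names, the match is a case-sensitive domain suffix match, and any agent strings passed in are placeholders.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-site rule: a domain and the User-Agent to send to it. */
struct ua_rule {
    const char *domain;
    const char *user_agent;
};

/*
 * Return the User-Agent of the first rule whose domain equals host or is a
 * dot-separated suffix of it; otherwise return fallback.
 */
const char *user_agent_for(const struct ua_rule *rules, size_t n,
                           const char *host, const char *fallback)
{
    size_t hlen = strlen(host);
    size_t i;

    for (i = 0; i < n; i++) {
        size_t dlen = strlen(rules[i].domain);

        if (dlen > hlen)
            continue;
        if (strcmp(host + hlen - dlen, rules[i].domain) != 0)
            continue;
        /* Require a label boundary so "notgoogle.com" does not match. */
        if (dlen == hlen || host[hlen - dlen - 1] == '.')
            return rules[i].user_agent;
    }
    return fallback;
}
```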
Setting the user agent on the command line adds a duplicate header:
```
./w3m -header "User-Agent: Mozilla" http://localhost:9999
GET / HTTP/1.0
User-Agent: w3m/0.5.3+git20190105
Accept: text/html, text/*;q=0.5, image/*, application/*, message/*, x-scheme-handler/*, audio/*, video/*, inode/*
Accept-Encoding: gzip, compress, bzip, bzip2, deflate
Accept-Language: en;q=1.0
Host: localhost:9999
Pragma: no-cache
Cache-control: no-cache
User-Agent: Mozilla
```
As a result, most servers use the first one given: either the default
w3m_version or the value of the `user_agent` config option.
With this patch, `User-Agent` can now be overridden from the command line.
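One way to avoid the duplicate is to check the user-supplied headers before emitting the default one. The sketch below shows that idea only; `header_overridden` is a hypothetical helper and not necessarily how the patch is implemented.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>  /* strncasecmp (POSIX) */

/*
 * Return 1 if any of the extra header lines is a "name: value" line for the
 * given header name (compared case-insensitively), 0 otherwise.
 */
int header_overridden(const char *const *extra, size_t n_extra,
                      const char *name)
{
    size_t len = strlen(name);
    size_t i;

    for (i = 0; i < n_extra; i++)
        if (strncasecmp(extra[i], name, len) == 0 && extra[i][len] == ':')
            return 1;
    return 0;
}
```

The request builder would then write its default `User-Agent` line only when `header_overridden(extra, n_extra, "User-Agent")` returns 0.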
Due to the "CRIME attack" (CVE-2012-4929), HTTPS clients
that negotiate TLS-level compression can leak plaintext
HTTP headers to a man-in-the-middle attacker.
Patch from openSUSE on 2012-11-12:
https://build.opensuse.org/request/show/141054