11 Commits

Author SHA1 Message Date
Rene Kita
e8287f36b0 Skip soft hyphen when reading token
The soft hyphen should only appear if a word is broken at the hyphen
position. Filter it out.

Adjust the entity test files to reflect the new behaviour.

This fixes Issue #224 and Debian Bug #830173.

Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=830173
Bug-Debian: https://github.com/tats/w3m/issues/224
2023-01-04 13:58:10 +01:00
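
The filtering described in this commit can be illustrated with a minimal sketch. It is not the code from the commit: the helper name `strip_soft_hyphen` is hypothetical, and the sketch assumes UTF-8 input, where the soft hyphen (U+00AD) is encoded as the two bytes 0xC2 0xAD.

```c
#include <stddef.h>

/* Hypothetical helper, not w3m's actual code: copy the UTF-8 string `src`
 * into `dst`, dropping every soft hyphen (U+00AD, encoded as 0xC2 0xAD).
 * `dst` must have room for at least strlen(src) + 1 bytes.
 * Returns the number of bytes written, excluding the terminating NUL. */
size_t strip_soft_hyphen(const char *src, char *dst)
{
    size_t n = 0;

    while (*src != '\0') {
        if ((unsigned char)src[0] == 0xC2 && (unsigned char)src[1] == 0xAD) {
            src += 2;           /* skip the two-byte soft hyphen */
            continue;
        }
        dst[n++] = *src++;      /* copy every other byte unchanged */
    }
    dst[n] = '\0';
    return n;
}
```

Called on a word containing a soft hyphen, the helper writes the same word with the soft hyphen removed, so the token compares equal to its plain spelling.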
Rene Kita
2692d22006 Fix generated HTML for entity test
- Remove stray elements: </td>
- Add missing elements: <tr></tr>
- Add link to show where to get qjs from
2023-01-04 13:58:10 +01:00
bptato
ffcea626bc Use &gt; instead of &gt in entity test generator
2021-03-01 19:34:58 +01:00
bptato
5cbc514d15 Fix small mistakes in entity test generator
2021-02-28 14:59:03 +01:00
bptato
eacde178f3 Support single-codepoint HTML entities specified by whatwg
https://html.spec.whatwg.org/multipage/named-characters.html#named-character-references
2021-02-28 13:57:43 +01:00
bptato
116e10749c Nested <dl>s
2021-02-13 18:02:26 +01:00
bptato
77ecf9b46b Fix <dl compact>
2021-02-13 17:26:30 +01:00
bptato
ef34bf837c <dl> test
2021-02-13 16:53:01 +01:00
Tatsuya Kinoshita
6339dd9f13 Merge pull request #146 from acli/20200821_a_CLEANED
Patch to make w3m’s handling of the a element HTML5 compatible (when the stream is HTML5)
2020-08-30 09:57:45 +09:00
Ambrose Li
48c9ec565d In HTML5, anchors should not be closed when encountering elements such as div, but should be closed when encountering elements such as button. Many sites that use HTML5-style anchors end up with links displayed with zero-length link text. The proposed patch corrects this behaviour by detecting whether the document is HTML5 and, if it is, suppressing the close-anchor action in CLOSE_A. A new macro handles the HTML5-specific cases where anchors are not already always closed.
This also fixes a bug in the tokenizing FSM in etc.c that prevented the !doctype element from being recognized; the fix is necessary because HTML5 detection depends on checking the !doctype element.
2020-08-24 23:48:09 -04:00
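
The doctype-based detection this commit relies on could look roughly like the sketch below. It is an illustration only: `is_html5_doctype` and `match_ci` are hypothetical names, not functions from w3m, and the check is simplified to accept only the short `<!DOCTYPE html>` form (matched case-insensitively), treating any doctype with extra identifiers as not HTML5.

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitively match `word` at the start of `s`.
 * Returns a pointer just past the match, or NULL on mismatch. */
static const char *match_ci(const char *s, const char *word)
{
    while (*word != '\0') {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*word))
            return NULL;
        s++;
        word++;
    }
    return s;
}

/* Hypothetical check, not w3m's actual code: returns 1 if `tag` is the
 * short HTML5 doctype "<!DOCTYPE html>", and 0 otherwise. */
int is_html5_doctype(const char *tag)
{
    const char *p = match_ci(tag, "<!doctype");

    if (p == NULL || !isspace((unsigned char)*p))
        return 0;               /* not a doctype declaration at all */
    while (isspace((unsigned char)*p))
        p++;
    p = match_ci(p, "html");
    if (p == NULL)
        return 0;               /* root element name is not "html" */
    while (isspace((unsigned char)*p))
        p++;
    return *p == '>';           /* anything before '>' means a legacy doctype */
}
```

A parser could set a per-document flag from this result when the doctype token is seen, and consult that flag before deciding whether an open anchor should be closed.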
Ambrose Li
9f18e67a9b Cleaned version of 20200823_q branch. Changes the behaviour of the q tag (when m17n and Unicode are configured) to use "smart" quotes if the display charset can handle them. Falls back to old behaviour (ASCII quotes with left/right quote semantics for 6/0 and 2/6) if display charset is us-ascii.
Also changes the behaviour of conv_entity() to convert left/right quotes and some dashes because named entities are needed for the new code for the q tag.
2020-08-23 22:20:43 -04:00
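
The charset-dependent fallback described in this commit can be sketched as follows. This is a hedged illustration, not the committed code: `q_quote` is a hypothetical function, the Unicode branch assumes UTF-8 output using U+201C and U+201D, and the ASCII pair (backtick for opening, apostrophe for closing) is an assumption about what the old behaviour looks like rather than something the commit message spells out.

```c
#include <stddef.h>

/* Hypothetical helper, not w3m's actual code: pick the quote string for a
 * <q> element.
 *   can_display_unicode: nonzero if the display charset can show U+201C/U+201D
 *   opening:             nonzero for the opening quote, zero for the closing one
 * The returned string is UTF-8 in the Unicode case and plain ASCII otherwise. */
const char *q_quote(int can_display_unicode, int opening)
{
    if (can_display_unicode)
        return opening ? "\xE2\x80\x9C"    /* U+201C LEFT DOUBLE QUOTATION MARK */
                       : "\xE2\x80\x9D";   /* U+201D RIGHT DOUBLE QUOTATION MARK */

    /* Assumed ASCII fallback with left/right semantics. */
    return opening ? "`" : "'";
}
```

Keeping the decision in one function means the renderer only has to know whether the display charset is limited to us-ascii, and the choice of quote characters stays in a single place.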