This also fixes a bug in the tokenizing FSM in etc.c that prevented the !doctype element from being recognized; the fix is necessary because HTML5 detection depends on checking the !doctype element.
Also changes the behaviour of conv_entity() to convert left/right quotes and some dashes because named entities are needed for the new code for the q tag.