add support for html5lib's parser

Registered by dan mackinlay

html5lib (http://code.google.com/p/html5lib/) includes a Python parser that handles invalid markup better than BeautifulSoup's native one and can produce BeautifulSoup-compatible output. We should use it as the parser (optionally?).

Blueprint information

Status:
Started
Approver:
None
Priority:
Undefined
Drafter:
None
Direction:
Needs approval
Assignee:
None
Definition:
Approved
Series goal:
None
Implementation:
Beta Available
Milestone target:
None
Started by
dan mackinlay

Related branches

Sprints

Whiteboard

We now use html5lib internally where it is available. The list of accepted tags could be changed to reflect html5lib's sanitization assumptions (http://wiki.whatwg.org/wiki/Sanitization_rules), although that set of tags is so large and liberal that changing it would probably make no difference to most users.
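
The "use it where it is available" behaviour can be sketched roughly as below. This is only an illustration of the optional-dependency pattern, not this project's actual code: the names extract_text and _TextCollector are hypothetical, and the sketch builds an ElementTree via html5lib.parse rather than a BeautifulSoup tree so that it stays self-contained. When html5lib is not installed it falls back to the standard-library parser.

```python
from html.parser import HTMLParser

try:
    import html5lib  # optional dependency; used only when installed
except ImportError:
    html5lib = None


class _TextCollector(HTMLParser):
    """Fallback (hypothetical helper): gather text with the stdlib parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def extract_text(markup):
    """Return the text content of markup, preferring html5lib if present."""
    if html5lib is not None:
        # html5lib repairs invalid markup (e.g. unclosed tags) while parsing.
        # namespaceHTMLElements=False keeps plain tag names such as "p".
        root = html5lib.parse(
            markup, treebuilder="etree", namespaceHTMLElements=False
        )
        return "".join(root.itertext())
    # No html5lib: fall back to the standard-library parser.
    collector = _TextCollector()
    collector.feed(markup)
    collector.close()  # flush any buffered text
    return "".join(collector.chunks)


if __name__ == "__main__":
    # The <b> tag is never closed; both code paths still recover the text.
    print(extract_text("<p>Hello <b>world</p>"))
```

Doing the import check once at module load keeps the choice of parser in a single place, so callers never need to know which parser handled their markup.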


Work Items

