Registered 2008-08-18 by dan mackinlay
A BeautifulSoup- (or html5lib-) based library that takes arbitrary, malicious, non-standards-compliant HTML and turns it into sane, sound, regular XHTML.
Development for this package was sponsored by the Powerhouse Museum http://
The code is extensively documented and contains lots of doctests; dive in and use it.
Register on Launchpad to commit fixes!
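As a rough illustration of the kind of cleanup described above, here is a minimal whitelist-based sketch. It is not this package's API: it uses only the standard library's `html.parser` as a stand-in for BeautifulSoup or html5lib, and the names `WhitelistSanitizer`, `sanitize`, `ALLOWED_TAGS`, `ALLOWED_ATTRS`, `VOID_TAGS` and `SAFE_SCHEMES` are hypothetical, chosen for this example only.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist for illustration; a real policy would be broader.
ALLOWED_TAGS = {"a", "b", "i", "em", "strong", "p", "br"}
ALLOWED_ATTRS = {"a": {"href"}}
VOID_TAGS = {"br"}
SAFE_SCHEMES = ("http://", "https://", "mailto:")


class WhitelistSanitizer(HTMLParser):
    """Re-emit only whitelisted tags and attributes as well-nested markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # output fragments
        self.open_tags = []  # stack of whitelisted tags still open
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # drop the tag itself; its text content is still kept
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, ()):
                continue  # e.g. onclick, style
            value = value or ""
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # e.g. javascript: URLs
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        if tag in VOID_TAGS:
            self.out.append("<%s%s />" % (tag, "".join(kept)))
        else:
            self.out.append("<%s%s>" % (tag, "".join(kept)))
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag in VOID_TAGS or tag not in self.open_tags:
            return
        # Close anything left open inside this element so nesting stays valid.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append("</%s>" % top)
            if top == tag:
                break

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))

    def result(self):
        # Close whatever the input never closed.
        closing = "".join("</%s>" % t for t in reversed(self.open_tags))
        return "".join(self.out) + closing


def sanitize(html_text):
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()
    return parser.result()
```

The sketch drops anything not explicitly allowed rather than trying to recognise known-bad input, which is the usual design choice for sanitizers of this kind.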
Project information
- Maintainer: dan mackinlay
- Driver: Not yet selected
- Development focus: trunk series
- Programming languages: Python
- Version control system: Bazaar
- Licences: Simplified BSD Licence
Series and milestones
Python HTML Sanitizer trunk series is the current focus of development
Latest bugs reported
- Bug #305174: Self test fails with UnicodeDecodeError (reported on 2008-12-04)
- Bug #292401: sanitizer mostly broken when used with html5lib (reported on 2008-11-01)
- Bug #292397: Fix python2.4 compatibility (reported on 2008-11-01)
- Bug #287323: html_parse throws an error without html5lib (reported on 2008-10-22)
Latest questions
- Are there any similar projects to this python HTML sanitizer? (posted on 2008-08-20)
Latest blueprints
- add support for html5lib's parser (registered on 2008-08-20)
- we need test coverage for common XSS vectors (registered on 2008-08-18)
- make this installable by easy_install (registered on 2008-08-18)
