- Dropped support for Python 2.7 and 3.5.
- Raised the minimum lxml version to the current 4.6.3.
- Switched from Travis CI to GitHub Actions. Added Python 3.9 to the CI matrix.
- Renamed the main branch to main.
- Switched to a declarative setup.
1.9 (2020-01-20)
- Added Python 3.8 to the CI matrix.
- Be able to keep the <style> tag by adding it to tags.
- Added a style check to the CI matrix.
1.8 (2019-11-21)
- Actually added support for customizing lxml's autolinking behavior using a dictionary argument.
- Stopped removing explicitly allowed attributes.
- Removed id from allowed attributes of <a> tags to provide an additional layer of defense against DOM clobbering attacks.
- Added an element preprocessor which assigns the id value to the name attribute of anchors if name isn't set or is empty. This should provide additional backwards compatibility, making the id removal less of a problem when using named anchors.
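
The anchor preprocessor described above can be pictured roughly as follows. This is a minimal sketch that uses the standard library's xml.etree.ElementTree instead of lxml; the function name anchor_id_to_name is hypothetical and the code is not the library's actual implementation. It only copies the value; dropping the id attribute itself is assumed to happen later, when disallowed attributes are removed.

```python
import xml.etree.ElementTree as ET


def anchor_id_to_name(element):
    """Copy ``id`` to ``name`` on <a> elements whose ``name`` is unset or empty.

    Hypothetical helper for illustration only; not html-sanitizer's own code.
    """
    if element.tag == "a" and element.get("id") and not element.get("name"):
        element.set("name", element.get("id"))
    return element


# Apply the preprocessor to every element of a small fragment.
root = ET.fromstring('<div><a id="top">Top</a><a id="x" name="kept">X</a></div>')
for el in root.iter():
    anchor_id_to_name(el)

first, second = root.findall("a")
print(first.get("name"))   # top (copied from id)
print(second.get("name"))  # kept (an existing name is left alone)
```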
1.7 (2019-02-19)
- Added a system check which validates sanitizer configurations early when using Django.
- Fixed an edge case where passing in an empty allowed tags list would unexpectedly and silently not remove any tags at all (because that's the way lxml's cleaner works).
- Changed the sanitizer tags, empty and separate options to also accept any iterable, not just sets.
- Changed the lru_cache import in the Django module to try functools first.
- Fixed the tag merging to also check tags in empty. This means that e.g. consecutive <hr> tags are also merged now when using the default settings.
- Made it possible to override the set of tags processed as whitespace. The default set is {"br"} which preserves the current behavior of stripping breaks from the beginning or end of tags' content.
1.6 (2018-06-29)
- Fixed another edge case where a tag which is allowed to be empty was erroneously removed if it contained not only whitespace but also a <br> tag.
1.5 (2018-06-01)
- Fixed a few whitespace normalization edge cases and a bug where removing an empty tag removed all whitespace.
- Added black for automatically formatting the Python code.
- By default, links with target="_blank" get an additional rel="noopener" attribute (Article by Mathias Bynens). If you're overriding the list of allowed attributes for anchor tags you must add rel to your list.
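
The rel="noopener" behavior for links with target="_blank" can be illustrated with a small sketch. The helper add_noopener is hypothetical and works on a plain attribute dictionary; appending to an existing rel value rather than overwriting it is an assumption of this sketch, not a documented property of the library.

```python
def add_noopener(attrib):
    """Return a copy of an anchor's attributes with rel="noopener" ensured
    whenever target="_blank" is present.

    Hypothetical helper for illustration only; not html-sanitizer's own code.
    """
    result = dict(attrib)
    if result.get("target") == "_blank":
        rel = result.get("rel", "").split()
        if "noopener" not in rel:
            # Assumption: keep any existing rel tokens and append noopener.
            rel.append("noopener")
        result["rel"] = " ".join(rel)
    return result


print(add_noopener({"href": "https://example.com/", "target": "_blank"}))
```

Because the added attribute is an ordinary rel attribute, it only survives sanitizing if rel is among the allowed attributes for anchors, which is why a custom attribute list has to include it.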
1.4 (2018-03-29)
- Corrected the required lxml version in install_requires.
- Added comments and testing for more edge cases.
- Changed the cleaner to not drop form elements; instead, <form> is converted to <p>, and form elements are preserved.
- Added an is_mergeable hook for conditionally preventing the merging of adjacent elements.
- Fixed a case where paragraphs were allowed inside paragraphs (which was never the idea).
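
To show what a hook for conditionally preventing merges might look like, here is a minimal sketch built on the standard library's xml.etree.ElementTree. The function merge_adjacent and the two-argument hook signature is_mergeable(e1, e2) are assumptions made for illustration; the library's own merging logic works on lxml trees and may differ in detail.

```python
import xml.etree.ElementTree as ET


def merge_adjacent(parent, is_mergeable=lambda e1, e2: True):
    """Merge neighbouring children that share a tag, unless the hook objects.

    Hypothetical sketch for illustration only; not html-sanitizer's own code.
    """
    i = 0
    while i < len(parent) - 1:
        e1, e2 = parent[i], parent[i + 1]
        # Text between the two elements means they are not truly adjacent.
        separated = bool((e1.tail or "").strip())
        if e1.tag == e2.tag and not separated and is_mergeable(e1, e2):
            # Append e2's leading text to the end of e1's content.
            if len(e1):
                e1[-1].tail = (e1[-1].tail or "") + (e2.text or "")
            else:
                e1.text = (e1.text or "") + (e2.text or "")
            e1.extend(list(e2))
            e1.tail = e2.tail
            parent.remove(e2)
        else:
            i += 1
    return parent


def same_class(e1, e2):
    """Example hook: only merge elements that carry the same class."""
    return e1.get("class") == e2.get("class")


root = ET.fromstring('<div><span class="x">a</span><span class="y">b</span></div>')
merge_adjacent(root, is_mergeable=same_class)
print(len(root))  # the two spans stay separate because their classes differ
```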
1.3 (2017-09-22)
- Fixed a case where tags with content between them were erroneously merged.
- Added a tox.ini file for running style checks and tests.
- Replaced REPLACEMENTS and element_filters with the more general element_preprocessors and element_postprocessors settings.
- Removed the restriction that <span> tags are never allowed.
1.2 (2017-05-25)
- Fixed the erroneous removal of all whitespace between adjacent elements.
- Fixed a few occasions where <br> tags were erroneously removed.
- Switched back to beautifulsoup4 for especially broken HTML and for HTML with emojis on macOS.
- Used a <div> instead of <anything> to wrap the document (since beautifulsoup4 does not like custom tags too much).
1.1 (2017-05-02)
- Added html_sanitizer.django.get_sanitizer to provide an official way of configuring HTML sanitizers using Django settings.
1.0 (2017-05-02)
- Initial public release.