Changed
Refactor TOC Sanitation
* All postprocessors are now run on heading content.
* Footnote references are now stripped from heading content. Fixes #660.
* A more robust `striptags` is provided to convert headings to plain text.
Unlike the `markupsafe` implementation, HTML entities are not unescaped.
* The plain text `name`, rich `html`, and unescaped raw `data-toc-label` are
saved to `toc_tokens`, allowing users to access the full rich text content of
the headings directly from `toc_tokens`.
* The value of `data-toc-label` is sanitized separately from heading content
before being written to `name`. This fixes a bug which allowed markup through
in certain circumstances. To access the raw unsanitized data, retrieve the
value from `token['data-toc-label']` directly.
* An `html.unescape` call is made just prior to calling `slugify` so that
`slugify` only operates on Unicode characters. Note that `html.unescape` is
not run on `name`, `html`, or `data-toc-label`.
* The functions `get_name` and `stashedHTML2text` defined in the `toc` extension
are both **deprecated**. Instead, third-party extensions should use some
combination of the new functions `run_postprocessors`, `render_inner_html` and
`striptags`.
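
The interplay between escaped names and unescaped slugs can be sketched with the standard library alone. This is a minimal illustration, not the extension's code: `plain_text` and `simple_slugify` are hypothetical stand-ins for the real `striptags` and `slugify`, and the sample heading is made up.

```python
import html
import re


def plain_text(rich: str) -> str:
    """Hypothetical stand-in for tag stripping; HTML entities stay escaped."""
    return " ".join(re.sub(r"<[^>]+>", "", rich).split())


def simple_slugify(value: str, separator: str = "-") -> str:
    """Hypothetical slugifier: drop punctuation, lowercase, join words."""
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    return re.sub(r"[-\s]+", separator, value)


heading_html = "<em>AT&amp;T</em> News"

# The plain-text name keeps its entities escaped.
name = plain_text(heading_html)

# Unescape only for the slug, so the slugifier sees real characters.
slug = simple_slugify(html.unescape(name))

# Without unescaping, the entity name leaks into the slug.
naive_slug = simple_slugify(name)

print(name)        # AT&amp;T News
print(slug)        # att-news
print(naive_slug)  # atampt-news
```

Keeping `name` escaped means it is still safe to insert into HTML, while the slug is computed from the characters the reader actually sees.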
Fixed
* Include `scripts/*.py` in the generated source tarballs (#1430).
* Ensure lines after heading in loose list are properly detabbed (#1443).
* Give smarty tree processor higher priority than toc (#1440).
* Permit carets (`^`) and square brackets (`]`) but explicitly exclude
backslashes (`\`) from abbreviations (#1444).
* In attribute lists (`attr_list`, `fenced_code`), quoted attribute values are
now allowed to contain curly braces (`}`) (#1414).
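
The abbreviation rule above can be illustrated with a small regular expression. This is a sketch under stated assumptions: `ABBR_DEF` and `parse_abbr` are hypothetical names, and the pattern is not claimed to be the one the `abbr` extension uses. It only shows a label that accepts carets and closing square brackets while rejecting backslashes.

```python
import re

# Hypothetical pattern for a definition line such as "*[HTML]: Hyper Text".
# The label may contain any character except a backslash.
ABBR_DEF = re.compile(r"^\*\[(?P<abbr>[^\\]*?)\][ ]?:[ ]*(?P<title>.*)$")


def parse_abbr(line: str):
    """Return (abbr, title) for a definition line, or None if it does not match."""
    m = ABBR_DEF.match(line)
    return (m.group("abbr"), m.group("title")) if m else None


print(parse_abbr("*[a^b]: Caret"))        # ('a^b', 'Caret')
print(parse_abbr("*[a]b]: Bracket"))      # ('a]b', 'Bracket')
print(parse_abbr("*[a\\b]: Backslash"))   # None
```

The lazy quantifier lets the label grow past an inner `]` until a `]` followed by a colon is found, which is why a closing bracket inside the label still parses.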