[new features]
* Added `docx` as an input format (Jesse Rosenthal). The docx
reader includes conversion of native Word equations to pandoc
LaTeX `Math` elements. Metadata is taken from paragraphs at the
beginning of the document with styles `Author`, `Title`, `Subtitle`,
`Date`, and `Abstract`.
* Added `epub` as an input format (Matthew Pickering). The epub
reader includes conversion of MathML to pandoc LaTeX `Math`
elements.
* Added `t2t` (Txt2Tags) as an input format (Matthew Pickering).
Txt2tags is a lightweight markup format described at
<http://txt2tags.org/>.
* Added `dokuwiki` as an output format (Clare Macrae).
* Added `haddock` as an output format.
* Added `--extract-media` option to extract media contained in a zip
container (docx or epub) while adjusting image paths to point to the
extracted images.
* Added a new markdown extension, `compact_definition_lists`, that
restores the syntax for definition lists of pandoc 1.12.x, allowing
tight definition lists with no blank space between items, and
disallowing lazy wrapping. (See below under behavior changes.)
* Added an extension `epub_html_exts` for parsing HTML in EPUBs.
* Added extensions `native_spans` and `native_divs` to activate
parsing of material in HTML span or div tags as Pandoc Span
inlines or Div blocks.
* `--trace` now works with the Markdown, HTML, Haddock, EPUB,
Textile, and MediaWiki readers. This is an option intended
for debugging parsing problems; ordinary users should not need
to use it.
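
As a rough illustration of what extracting media from a zip container involves, here is a minimal Python sketch. It is not pandoc's implementation: the function name `extract_media`, the `IMAGE_EXTS` list, and the choice to flatten entries by base name are all assumptions made for the example. It returns a mapping from archive paths to extracted paths, which is the information needed to adjust image references.

```python
import os
import zipfile

# Illustrative list of extensions; pandoc's own notion of "media" may differ.
IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".gif", ".svg")


def extract_media(container, out_dir):
    """Copy image entries of a zip container (docx or epub) into out_dir.

    `container` may be a file name or a binary file-like object.
    Returns a dict mapping each archive entry name to its extracted path.
    """
    os.makedirs(out_dir, exist_ok=True)
    mapping = {}
    with zipfile.ZipFile(container) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(IMAGE_EXTS):
                continue
            # Assumption: flatten to the base name; name clashes are not handled.
            target = os.path.join(out_dir, os.path.basename(name))
            with open(target, "wb") as fh:
                fh.write(zf.read(name))
            mapping[name] = target
    return mapping
```

For a docx file, images normally live under `word/media/`, so the mapping would contain keys such as `word/media/image1.png`.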
[behavior changes]
* Changed behavior of the `markdown_attribute` extension, to bring
it in line with PHP markdown extra and multimarkdown. Setting
`markdown="1"` on an outer tag affects all contained tags,
recursively, until it is reversed with `markdown="0"` (1378).
* Revised markdown definition list syntax (1429). Both the reader
and writer are affected. This change brings pandoc's definition list
syntax into alignment with that used in PHP markdown extra and
multimarkdown (with the exception that pandoc is more flexible about
the definition markers, allowing tildes as well as colons). Lazily
wrapped definitions are now allowed. Blank space is required
between list items. The space before a definition is used to determine
whether it is a paragraph or a "plain" element. **WARNING: This change
may break existing documents!** Either check your documents for
definition lists without blank space between items, or use
`markdown+compact_definition_lists` for the old behavior.
* `.numberLines` now works in fenced code blocks even if no language
is given (1287, jgm/highlighting-kate#40).
* Improvements to `--filter`:
+ Don't search PATH for a filter with an explicit path.
This fixed a bug wherein `--filter ./caps.py` would run `caps.py` from
the system path, even if there was a `caps.py` in the working directory.
+ Respect shebang if filter is executable (1389).
+ Don't print misleading error message.
Previously pandoc would say that a filter was not found,
even in a case where the filter had a syntax error.
* HTML reader:
+ Parse `div` and `span` elements even without `--parse-raw`,
provided `native_divs` and `native_spans` extensions are set.
Motivation: these now generate native pandoc Div and Span
elements, not raw HTML.
+ Parse EPUB-specific elements if the `epub_html_exts`
extension is enabled. These include `switch`, `footnote`,
`rearnote`, `noteref`.
* Org reader:
+ Support for inline LaTeX. Inline LaTeX is now accepted and parsed by the
org-mode reader. Both math symbols (like `\tau`) and LaTeX commands (like
`\cite{Coffee}`), can be used without any further escaping (Albert
Krewinkel).
* Textile reader and writer:
+ The `raw_tex` extension is no longer set by default. You can
enable it with `textile+raw_tex`.
* DocBook reader:
+ Support `equation`, `informalequation`, `inlineequation` elements with
`mml:math` content. This is converted into LaTeX and put into a Pandoc
Math inline.
* Revised `plain` output, largely following the style of Project
Gutenberg:
+ Emphasis is rendered with `_underscores_`, strong emphasis
with ALL CAPS.
+ Headings are rendered differently, with space to set them off,
not with setext style underlines. Level 1 headers are ALL CAPS.
+ Math is rendered using unicode when possible, but without the
distracting emphasis markers around variables.
+ Footnotes use a regular `[n]` style.
* Markdown writer:
+ Horizontal rules are now a line across the whole page.
+ Prettier pipe tables. Columns are now aligned (1323).
+ Respect the `raw_html` extension. `pandoc -t markdown-raw_html`
no longer emits any raw HTML, including span and div tags
generated by Span and Div elements.
+ Use span with style for `SmallCaps` (1360).
* HTML writer:
+ Autolinks now have class `uri`, and email autolinks have class
`email`, so they can be styled.
* Docx writer:
+ Document formatting is carried over from `reference.docx`.
This includes margins, page size, page orientation, header,
and footer, including images in headers and footers.
+ Include abstract (if present) with `Abstract` style (1451).
+ Include subtitle (if present) with `Subtitle` style, rather
than tacking it on to the title (1451).
* Org writer:
+ Write empty span elements with an id attribute as org anchors.
For example `Span ("uid",[],[]) []` becomes `<<uid>>`.
* LaTeX writer:
+ Put table captions above tables, to match the conventional
standard. (Previously they appeared below tables.)
+ Use `\(..\)` instead of `$..$` for inline math (1464).
+ Use `\nolinkurl` in email autolinks. This allows them to be styled
using `\urlstyle{tt}`. Thanks to Ulrike Fischer for the solution.
+ Use `\textquotesingle` for `'` in inline code. Otherwise we get
curly quotes in the PDF output (1364).
+ Use `\footnote<.>{..}` for notes in beamer, so that footnotes
do not appear before the overlays in which their markers appear
(1525).
+ Don't produce a `\label{..}` for a Div or Span element. Do produce
a `\hyperdef{..}` (1519).
* EPUB writer:
+ If the metadata includes `page-progression-direction` (which can be
`ltr` or `rtl`), the `page-progression-direction` attribute will
be set in the EPUB spine (1455).
* Custom lua writers:
+ Custom writers now work with `--template`.
+ Removed HTML header scaffolding from `sample.lua`.
+ Made citation information available in lua writers.
* `--normalize` and `Text.Pandoc.Shared.normalize` now consolidate
adjacent `RawBlock`s when possible.
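
For readers unfamiliar with filters such as the `caps.py` mentioned under `--filter` above, the following is a minimal Python sketch of what such a filter might do. It assumes only that the JSON AST represents text as objects of the form `{"t": "Str", "c": "..."}`; the function names `caps` and `caps_filter` are hypothetical, and this is not the `caps.py` the changelog refers to.

```python
import json


def caps(node):
    """Return a copy of a pandoc JSON AST fragment with every Str uppercased."""
    if isinstance(node, list):
        return [caps(child) for child in node]
    if isinstance(node, dict):
        # Assumption: a text run is encoded as {"t": "Str", "c": "<text>"}.
        if node.get("t") == "Str" and isinstance(node.get("c"), str):
            return {**node, "c": node["c"].upper()}
        return {key: caps(value) for key, value in node.items()}
    return node  # bare strings, numbers, booleans and None are left alone


def caps_filter(json_text):
    """Transform a serialized document and return the new serialization."""
    return json.dumps(caps(json.loads(json_text)))
```

To be usable with `--filter ./caps.py`, the script would additionally need an entry point that reads the document from standard input and writes `caps_filter(...)` of it to standard output. Because bare strings are left untouched, the contents of inline code are not uppercased.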
[API changes]
* Added `Text.Pandoc.Readers.Docx`, exporting `readDocx` (Jesse Rosenthal).
* Added `Text.Pandoc.Readers.EPUB`, exporting `readEPUB` (Matthew
Pickering).
* Added `Text.Pandoc.Readers.Txt2Tags`, exporting `readTxt2Tags` (Matthew
Pickering).
* Added `Text.Pandoc.Writers.DokuWiki`, exporting `writeDokuWiki`
(Clare Macrae).
* Added `Text.Pandoc.Writers.Haddock`, exporting `writeHaddock`.
* Added `Text.Pandoc.MediaBag`, exporting `MediaBag`, `lookupMedia`,
`insertMedia`, `mediaDirectory`, `extractMediaBag`. The docx and epub
readers return a pair of a `Pandoc` document and a `MediaBag` with
the media resources they contain. This can be extracted using
`--extract-media`. Writers that incorporate media (PDF, Docx,
ODT, EPUB, RTF, or HTML formats with `--self-contained`) will look
for resources in the `MediaBag` generated by the reader, in addition to
the file system or web.
* `Text.Pandoc.Readers.TexMath`: Removed deprecated `readTeXMath`.
Renamed `readTeXMath'` to `texMathToInlines`.
* `Text.Pandoc`: Added `Reader` data type (Matthew Pickering).
`readers` now associates names of readers with `Reader`
structures. This allows inclusion of readers, like the docx
reader, that take binary rather than textual input.
* `Text.Pandoc.Shared`:
+ Added `capitalize` (Artyom Kazak), and replaced uses of
`map toUpper` (which gives bad results for many languages).
+ Added `collapseFilePath`, which removes intermediate `.` and
`..` from a path (Matthew Pickering).
+ Added `fetchItem'`, which works like `fetchItem` but searches
a `MediaBag` before looking on the net or file system.
+ Added `withTempDir`.
+ Added `removeFormatting`.
+ Added `extractSpaces` (from HTML reader) and generalized its type
so that it can be used by the docx reader (Matthew Pickering).
+ Added `ordNub`.
+ Added `normalizeInlines`, `normalizeBlocks`.
+ `normalize` is now `Pandoc -> Pandoc` instead of
`Data a => a -> a`. Some users may need to change their uses of
`normalize` to the newly exported `normalizeInlines` or
`normalizeBlocks`.
* `Text.Pandoc.Options`:
+ Added `writerMediaBag` to `WriterOptions`.
+ Removed deprecated and no longer used `readerStrict` in
`ReaderOptions`. This is handled by `readerExtensions` now.
+ Added `Ext_compact_definition_lists`.
+ Added `Ext_epub_html_exts`.
+ Added `Ext_native_divs` and `Ext_native_spans`.
This allows users to turn off the default pandoc behavior of
parsing contents of div and span tags in markdown and HTML
as native pandoc Div blocks and Span inlines.
* `Text.Pandoc.Parsing`:
+ Generalized `readWith` to `readWithM` (Matthew Pickering).
+ Export `runParserT` and `Stream` (Matthew Pickering).
+ Added `HasQuoteContext` type class (Matthew Pickering).
+ Generalized types of `mathInline`, `smartPunctuation`, `quoted`,
`singleQuoted`, `doubleQuoted`, `failIfInQuoteContext`,
`applyMacros` (Matthew Pickering).
+ Added custom `token` (Matthew Pickering).
+ Added `stateInHtmlBlock` to `ParserState`. This is used to keep
track of the ending tag we're waiting for when we're parsing inside
HTML block tags.
+ Added `stateMarkdownAttribute` to `ParserState`. This is used
to keep track of whether the markdown attribute has been set in
an enclosing tag.
+ Generalized type of `registerHeader`, using new type classes
`HasReaderOptions`, `HasIdentifierList`, `HasHeaderMap` (Matthew
Pickering). These allow certain common functions to be reused
even in parsers that use custom state (instead of `ParserState`),
such as the MediaWiki reader.
+ Moved `inlineMath`, `displayMath` from Markdown reader to Parsing,
and generalized their types (Matthew Pickering).
* `Text.Pandoc.Pretty`:
+ Added `nestle`.
+ Added `blanklines`, which guarantees a certain number of blank lines
(and no more).
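
The idea behind `collapseFilePath` (removing intermediate `.` and `..` components) can be sketched in Python as follows. This is an illustration of the technique, not a port of the Haskell function: the name `collapse_path` is hypothetical, only `/` is treated as a separator, and the handling of edge cases (leading `..`, `..` at the root of an absolute path, empty results) is an assumption that may differ from pandoc's behavior.

```python
def collapse_path(path):
    """Remove intermediate '.' and '..' components from a '/'-separated path."""
    absolute = path.startswith("/")
    kept = []
    for part in path.split("/"):
        if part in ("", "."):
            continue  # empty and '.' components carry no information
        if part == ".." and kept and kept[-1] != "..":
            kept.pop()  # 'dir/..' cancels out
        elif part == ".." and absolute and not kept:
            continue  # assumption: '/..' is treated as '/'
        else:
            kept.append(part)  # ordinary name, or a leading '..' we must keep
    joined = "/".join(kept)
    if absolute:
        return "/" + joined
    return joined or "."
```

Under these assumptions, `collapse_path("a/./b/../c")` gives `"a/c"`, while a leading `..` in a relative path is preserved because there is nothing for it to cancel.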
[bug fixes]
* Markdown reader:
+ Fixed parsing of indented code in list items. Indented code
at the beginning of a list item must be indented eight spaces
from the margin (or edge of the container), or four spaces
from the list marker, whichever is greater.
+ Fixed small bug in HTML parsing with `markdown_attribute`, which
caused incorrect tag nesting for input like
`<aside markdown="1">*hi*</aside>`.
+ Fixed regression with intraword underscores (1121).
+ Improved parsing of inline links containing quote characters (1534).
+ Slight rewrite of `enclosure`/`emphOrStrong` code.
+ Revamped raw HTML block parsing in markdown (1330).
We no longer include trailing spaces and newlines in the
raw blocks. We look for closing tags for elements (but without
backtracking). Each block-level tag is its own `RawBlock`;
we no longer try to consolidate them (though `--normalize` will do so).
+ Combine consecutive latex environments. This helps when you have
two minipages which can't have blank lines between them (690, 1196).
+ Support smallcaps through span.
`<span style="font-variant:small-caps;">foo</span>` will be
parsed as a `SmallCaps` inline, and will work in all output
formats that support small caps (1360).
+ Prevent spurious line breaks after list items (1137). When the
`hard_line_breaks` option was specified, pandoc would formerly
produce a spurious line break after a tight list item.
+ Fixed table parsing bug (1333).
+ Handle `c++` and `objective-c` as language identifiers in
github-style fenced blocks (1318).
+ Inline math must have nonspace before final `$` (1313).
* LaTeX reader:
+ Handle comments at the end of tables. This resolves the issue
illustrated in <http://stackoverflow.com/questions/24009489>.
+ Correctly handle table rows with too few cells. LaTeX seems to
treat them as if they have empty cells at the end (241).
+ Handle leading/trailing spaces in `\emph` better.
`\emph{ hi }` gets parsed as `[Space, Emph [Str "hi"], Space]`
so that we don't get things like `* hi *` in markdown output.
Also applies to `\textbf` and some other constructions (1146).
+ Don't assume preamble doesn't contain environments (1338).
+ Allow (and discard) optional argument for `\caption` (James Aspnes).
* HTML reader:
+ Fixed major parsing problem with HTML tables. Table cells were
being combined into one cell (1341).
+ Fixed performance issue with malformed HTML tables.
We let a `</table>` tag close an open `<tr>` or `<td>` (1167).
+ Allow space between `<col>` and `</col>`.
+ Added `audio` and `source` in `eitherBlockOrInline`.
+ Moved `video`, `svg`, `progress`, `script`, `noscript` from
`blockTags` to `eitherBlockOrInline`.
+ `map` and `object` were mistakenly in both lists; they have been removed
from `blockTags`.
+ Ignore `DOCTYPE` and `xml` declarations.
* MediaWiki reader:
+ Don't parse backslash escapes inside `<source>` (1445).
+ Tightened up template parsing.
The opening `{{` must be followed by an alphanumeric or `:`.
This prevents the exponential slowdown in 1033.
+ Support "Bild" for images.
* DocBook reader:
+ Better handle elements inside code environments. Pandoc's document
model does not allow structure inside code blocks, but at least this way
we preserve the text (1449).
+ Support `<?asciidoc-br?>` (1236).
* Textile reader:
+ Fixed list parsing. Lists can now start without an intervening
blank line (1513).
+ HTML block-level tags that do not start a line are parsed as
inline HTML and do not interrupt paragraphs (as in RedCloth).
* Org reader:
+ Make tildes create inline code (1345). Also relabeled `code` and
`verbatim` parsers to accord with the org-mode manual.
+ Respect `:exports` header argument in code blocks (Craig Bosma).
+ Fixed tight lists with sublists (1437).
* EPUB writer:
+ Avoid excess whitespace in `nav.xhtml`. This should improve
TOC view in iBooks (1392).
+ Fixed regression on cover image.
In 1.12.4 and 1.12.4.2, the cover image would not appear properly,
because the metadata id was not correct. Now we derive the id from the
actual cover image filename, which we preserve rather than using
"cover-image".
+ Keep newlines between block elements. This allows
easier diff-ability (1424).
+ Use `stringify` instead of custom `plainify`.
+ Use `renderTags'` for all tag rendering. This properly handles tags
that should be self-closing. Previously `<hr/>` would appear in EPUB
output as `<hr></hr>` (1420).
+ Better handle HTML media tags.
+ Handle multiple dates with OPF `event` attributes. Note: in EPUB3 we
can have only one dc:date, so only the first one is used.
* LaTeX writer:
+ Correctly handle figures in notes. Notes can't contain figures in
LaTeX, so we fake it to avoid an error (1053).
+ Fixed strikeout + highlighted code (1294).
Previously strikeout highlighted code caused an error.
* ConTeXt writer:
+ Improved detection of autolinks with URLs containing escapes.
* RTF writer:
+ Improved image embedding: `fetchItem'` is now used to get the
images, and calculated image sizes are indicated in the RTF.
+ Avoid extra paragraph tags in metadata (1421).
* HTML writer:
+ Deactivate "incremental" inside slide speaker notes (1394).
+ Don't include empty items in the table of contents for
slide shows. (These would result from creating a slide
using a horizontal rule.)
* MediaWiki writer:
+ Minor renaming of `st` prefixed names.
* AsciiDoc writer:
+ Double up emphasis and strong emphasis markers in intraword
contexts, as required by asciidoc (1441).
* Markdown writer:
+ Avoid wrapping that might start a list, blockquote, or header (1013).
+ Use Span instead of (hackish) `SmallCaps` in `plainify`.
+ Don't use braced attributes for fenced code (1416).
If `Ext_fenced_code_attributes` is not set, the first class
attribute will be printed after the opening fence as a bare word.
+ Separate adjacent lists of the same kind with an HTML comment (1458).
* PDF writer:
+ Fixed treatment of data uris for images (1062).
* Docx writer:
+ Use Compact style for empty table cells (1353).
Otherwise we get overly tall lines when there are empty
table cells and the other cells are compact.
+ Create overrides per-image for `media/` in reference docx.
This should be somewhat more robust and cover more types of images.
+ Improved `entryFromArchive` to avoid an unneeded parse.
+ Section numbering carries over from reference.docx (1305).
+ Simplified `abstractNumId` numbering. Instead of sequential numbering,
we assign numbers based on the list marker styles.
* `Text.Pandoc.Options`:
+ Removed `Ext_fenced_code_attributes` from `markdown_github`
extensions.
* `Text.Pandoc.ImageSize`:
+ Use default instead of failing if image size not found
in exif header (1358).
+ Ignore unknown exif header tag rather than crashing.
Some images seem to have tag type of 256, which was causing
a runtime error.
* `Text.Pandoc.Shared`:
+ `fetchItem`: unescape URI encoding before reading local file (1427).
+ `fetchItem`: strip a fragment like `?iefix` from the extension before
doing mime lookup, to improve mime type guessing.
+ Improved logic of `fetchItem`: absolute URIs are fetched from the net;
other things are treated as relative URIs if `sourceURL` is `Just _`,
otherwise as file paths on the local file system.
+ `fetchItem` now properly handles links without a protocol (1477).
+ `fetchItem` now escapes characters not allowed in URIs before trying
to parse the URIs.
+ Fixed runtime error with `compactify'DL` on certain lists (1452).
* `pandoc.hs`: Don't strip path off of `writerSourceURL`: the path is
needed to resolve relative URLs when we fetch resources (750).
* `Text.Pandoc.Parsing`:
+ Simplified `dash` and `ellipsis` (1419).
+ Removed `(>>~)` in favor of the equivalent `(<*)` (Matthew Pickering).
+ Generalized functions to use `ParsecT` (Matthew Pickering).
+ Added `isbn` and `pmid` to list of recognized schemes (Matthew
Pickering).
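
The revised `fetchItem` logic listed under `Text.Pandoc.Shared` above can be pictured roughly as the following decision rule. This is a Python sketch, not pandoc's Haskell code: `resolve_resource` and its `"net"`/`"file"` tags are hypothetical names, and "absolute URI" is approximated here as "has both a scheme and a host", which is an assumption.

```python
from urllib.parse import unquote, urljoin, urlparse


def resolve_resource(src, source_url=None):
    """Decide where a resource reference should be fetched from.

    Returns ("net", url) or ("file", path).
    `source_url` plays the role of `sourceURL`: None corresponds to Nothing.
    """
    parsed = urlparse(src)
    if parsed.scheme and parsed.netloc:
        # Absolute URI: fetch from the net as given.
        return ("net", src)
    if source_url is not None:
        # Relative reference: resolve it against the document's source URL.
        return ("net", urljoin(source_url, src))
    # No source URL: treat as a local path, undoing URI escapes first.
    return ("file", unquote(src))
```

For example, with `source_url="http://example.com/docs/index.html"` the reference `img/a.png` resolves to `http://example.com/docs/img/a.png`, whereas without a source URL `my%20file.png` is read from the local file `my file.png`.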
[template changes]
* Added haddock template.
* EPUB3: Added `type` attribute to `link` tags. They are supposed to
be "advisory" in HTML5, but kindlegen seems to require them.
* EPUB3: Put title page in section with `epub:type="titlepage"`.
* LaTeX: Made `\subtitle` work properly (1327).
* LaTeX/Beamer: remove conditional around date (1321).
* LaTeX: Added `lot` and `lof` variables, which can be set to
get `\listoftables` and `\listoffigures` (1407). Note that
these variables can be set at the command line with `-Vlot -Vlof`
or in YAML metadata.
[under the hood improvements]
* Rewrote normalize for efficiency (1385).
* Rewrote Haddock reader to use `haddock-library` (1346).
+ This brings pandoc's rendering of haddock markup in line
with the new haddock.
+ Fixed line breaks in `` code blocks.
+ alex and happy are no longer build-depends.
* Added `Text.Pandoc.Compat.Directory` to allow building against
different versions of the `directory` library.
* Added `Text.Pandoc.Compat.Except` to allow building against
different versions of `mtl`.
* Code cleanup in some writers, using Reader monad to avoid
passing options parameter around (Matej Kollar).
* Improved readability in `pandoc.hs`.
* Miscellaneous code cleanups (Artyom Kazak).
* Avoid `import Prelude hiding (catch)` (1309, thanks to Michael
Thompson).
* Changed `http-conduit` flag to `https`. Depend on `http-client`
and `http-client-tls` instead of `http-conduit`. (Note: pandoc still
depends on `conduit` via `yaml`.)
* Require `highlighting-kate >= 0.5.8.5` (1271, 1317, Debian 753299).
This change to highlighting-kate means that PHP fragments no longer need
to start with `<?php`. It also fixes a serious bug causing failures with
ocaml and fsharp.
* Require latest `texmath`. This fixes `\tilde{E}` and allows
`\left` to be used with `]`, `)` etc. (1319), among many other
improvements.
* Require latest `zip-archive`. This has fixes for unicode path names.
* Added tests for plain writer.
* `Text.Pandoc.Templates`:
+ Fail informatively on template syntax errors.
With the move from parsec to attoparsec, we lost good error
reporting. In fact, since we weren't testing for end of input,
malformed templates would fail silently. Here we revert to
Parsec for better error messages.
+ Use `ordNub` (1022).
* Benchmarks:
+ Made benchmarks compile again (Artyom Kazak).
+ Fixed so that the failure of one benchmark does not prevent others
from running (Artyom Kazak).
+ Use `nfIO` instead of the `getLength` trick to force full evaluation.
+ Changed benchmark to use only the test suite, so that benchmarks
run more quickly.
* Windows build script:
+ Add `-windows` to file name.
+ Use one install command for pandoc, pandoc-citeproc.
+ Force install of pandoc-citeproc.
* `make_osx_package`: Call zip file `pandoc-VERSION-osx.zip`.
The zip should not be named `SOMETHING.pkg.zip`, or OSX finder
will extract it into a folder named `SOMETHING.pkg`, which it
will interpret as a defective package (1308).
* `README`:
+ Made headers for all extensions so they have IDs and can be
linked to (Beni Cherniavsky-Paskin).
+ Fixed typos (Phillip Alday).
+ Fixed documentation of attributes (1315).
+ Clarified documentation on small caps (1360).
+ Better documentation for `fenced_code_attributes` extension
(Caleb McDaniel).
+ Documented fact that you can put YAML metadata in a separate file
(1412).