Background: Some changes to how ebooks are displayed appear to be caused by an option that was turned on in Calibre (https://github.com/wsexport/tool/pull/297). This only affects formats produced by Calibre, such as PDF and Mobi; EPUBs are not affected. I believe the option was enabled to improve the performance of PDF generation.
Examples include:
- The option tries to guess line breaks in text, sometimes leading to undesirable results. Example below from https://de.wikisource.org/wiki/Bremens_edle_Tochter (https://wsexport-test.wmflabs.org/?lang=de&page=Bremens+edle+Tochter&format=pdf-a4&fonts=):
- It tries to autodetect headings. Example from https://en.wikisource.org/api/rest_v1/page/html/John_James_Audubon%2FChronology (https://wsexport-test.wmflabs.org/?lang=en&page=John_James_Audubon/Chronology&format=pdf-a4&fonts=):
- It also does some nice things, like italicizing text and making hyphenation consistent across the text. Another example, from https://en.wikisource.org/wiki/John_James_Audubon/I. (https://wsexport-test.wmflabs.org/?lang=en&page=John_James_Audubon/I.&format=pdf-a4&fonts=):
- It does a few other things as well; more details are at https://manual.calibre-ebook.com/conversion.html#heuristic-processing. There are probably other differences I have not found yet.
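
For reference, Calibre's `ebook-convert` lets individual heuristics be switched off while heuristic processing as a whole stays on. The sketch below only builds such a command line. The helper `build_convert_command` and the file names are hypothetical, and I have not checked how the export tool actually invokes Calibre, so treat this as an illustration rather than a proposed patch.

```python
from typing import List, Sequence

# ebook-convert flags from Calibre's heuristic-processing options.
DISABLE_UNWRAP_LINES = "--disable-unwrap-lines"                # line-break guessing
DISABLE_MARKUP_HEADINGS = "--disable-markup-chapter-headings"  # heading autodetection


def build_convert_command(
    source: str, target: str, disabled: Sequence[str] = ()
) -> List[str]:
    """Return an ebook-convert argv with heuristics on, minus the given ones.

    Hypothetical helper: it only assembles the argument list and does not
    run Calibre.
    """
    command = ["ebook-convert", source, target, "--enable-heuristics"]
    command.extend(disabled)
    return command


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    argv = build_convert_command(
        "book.epub", "book.pdf", [DISABLE_UNWRAP_LINES, DISABLE_MARKUP_HEADINGS]
    )
    print(" ".join(argv))
```

Keeping italicization and dehyphenation enabled while disabling only line unwrapping and heading markup would, in principle, preserve the nice effects listed above and avoid the two problematic ones.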