Caveats with the existing TextExtracts service are documented here:
https://www.mediawiki.org/wiki/Extension:TextExtracts#Caveats
We should also surface these in the API response as warnings where possible.
- When exsentences is used in HTML mode, a warning should be sent to the consumer: "exsentences is not guaranteed to work when used in HTML mode."
- In HTML mode, show a warning for all output: "HTML may be malformed and/or unbalanced and may omit inline images. Use at your own risk. Known problems are listed at https://www.mediawiki.org/wiki/Extension:TextExtracts#Caveats"
- The warning text should be configurable via i18n messages. WikimediaMessages, for instance, may want to point users to the alternative RESTBase service.
Questions
- Should we also point them to the RESTBase service in these warning messages?
This is up to each wiki, which can customise the message.
- Should we add math to the elements that are excluded from text extracts, and check that no math tags appear inside https://en.wikipedia.org/w/api.php?format=jsonfm&action=query&prop=extracts&titles=Planck%20constant ? Alternatively, we could keep them, as consumers can parse and display them if they are capable of doing so.
No: let's keep the status quo. Changing how the extract behaves may interfere with how people currently use the API, and it provides little gain for us.
Developer notes
You can add warnings like so when certain parameter combinations are used:
$this->addWarning( 'i18n-message-key' );
See T170617#4047560 for details.
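A minimal sketch of how the two warnings could be wired up, not the actual implementation. The class is a standalone stand-in so the snippet runs outside MediaWiki; in the real module addWarning() would come from ApiBase. The parameter names ('plaintext', 'sentences') and both message keys are assumptions made for illustration.

```php
<?php
// Hypothetical stand-in for the extracts API module. In MediaWiki,
// addWarning() is inherited from ApiBase; this stub only records the keys.
class ExtractsWarningSketch {
	/** @var string[] Message keys passed to addWarning(), in order */
	public $warnings = [];

	public function addWarning( $msgKey ) {
		$this->warnings[] = $msgKey;
	}

	/**
	 * Emit the HTML-mode warnings described in this task.
	 *
	 * @param array $params Assumed shape: 'plaintext' (bool), 'sentences' (int|null)
	 */
	public function addHtmlModeWarnings( array $params ) {
		if ( !empty( $params['plaintext'] ) ) {
			// Plain text mode: neither warning applies.
			return;
		}
		// Shown for all HTML output. The key is hypothetical; its text
		// would be overridable per wiki like any other i18n message.
		$this->addWarning( 'textextracts-warning-malformed-html' );
		if ( isset( $params['sentences'] ) ) {
			// exsentences combined with HTML mode. Hypothetical key.
			$this->addWarning( 'textextracts-warning-sentences-html' );
		}
	}
}
```

Because only message keys are passed, a wiki (or WikimediaMessages) could override the message text, for example to mention RESTBase, without any change to the code.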