This is a result of T233287: WMSE site visit at WMDE for exchange/collaboration around Wikispeech
Background
One of the two stateless features of the Wikispeech backend (Speechoid) is the lexicon of pronunciations. This lexicon is part of the Pronlex component, which manages the lexicon and also applies it during speech synthesis.
Porting Pronlex (or the required bits thereof) to the MediaWiki extension part of Wikispeech has been suggested as a solution to this.
This has the additional benefit of being a good preparatory step for adding the option of using Wikidata (or any Wikibase instance) instead of the lexicon [out of scope for this task]. It would also make use of mechanisms for handling databases built into MediaWiki.
Since Pronlex is the only component written in Go, this would also have the benefit of reducing the number of languages used by Wikispeech.
An identified downside of the porting is that Pronlex is today usable even without MediaWiki. Porting it would essentially fork the project, likely leaving two versions to be maintained in parallel.
If the porting does not take place, Pronlex would still have to be updated to bring it in line with Wikimedia/MediaWiki expectations, e.g. going from SQLite3 to MySQL and setting up a mechanism where database reads can be done from a slave server [out of scope for this task].
To investigate
What has to be clarified in order to make a decision on porting is:
- exactly what jobs are done by Pronlex as part of the speech rendering cycle,
- an estimate of the time/effort required for porting the required parts of Pronlex,
- an estimate of the time/effort required for updating wikispeech_mockup to expect lexicon handling to have been done prior to being called.
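To make the last point more concrete, below is a minimal Python sketch of what "lexicon handling done prior to being called" could look like. It is an illustration only, not the actual Pronlex or wikispeech_mockup behaviour: the `Token` structure, the `apply_lexicon` and `needs_fallback` helpers, the in-memory `LEXICON` dict and the transcription string are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Token:
    """A word plus an optional transcription filled in before synthesis."""
    orth: str
    transcription: Optional[str] = None


# Hypothetical stand-in for the lexicon; a real setup would read from a database.
LEXICON: Dict[str, str] = {"wikispeech": "w I k I s p i: tS"}


def apply_lexicon(words: List[str], lexicon: Dict[str, str]) -> List[Token]:
    """Look up each word (case-insensitively) and attach its transcription, if any."""
    return [Token(orth=w, transcription=lexicon.get(w.lower())) for w in words]


def needs_fallback(tokens: List[Token]) -> List[str]:
    """Words without a lexicon entry, which the backend would still have to handle."""
    return [t.orth for t in tokens if t.transcription is None]


tokens = apply_lexicon(["Wikispeech", "rocks"], LEXICON)
missing = needs_fallback(tokens)
```

In this sketch the caller would pass the already-annotated tokens on to the backend, so the backend only has to deal with the words listed in `missing`.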
Notes and mockups from the internal meetings can be found here.