
Extension is owner of default voice per language logic
Closed, Resolved (Public)

Description

Specify a default voice in the config for each language. If not specified, the first voice in the list of available voices could be used. Voice should always be sent in Speechoid requests.
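The fallback rule described above could be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: the function name `resolveDefaultVoice`, the shape of the two arrays and the voice names in the usage note are all assumptions made for the example.

```php
<?php
// Hypothetical helper; names and array shapes are illustrative only.

/**
 * Pick the voice to send to Speechoid when the caller did not specify one.
 *
 * @param string $language Language code, e.g. 'sv'.
 * @param array<string,string> $defaultVoices Configured default voice per language.
 * @param array<string,string[]> $availableVoices Available voices per language.
 * @return string Voice name to include in the Speechoid request.
 */
function resolveDefaultVoice(
	string $language,
	array $defaultVoices,
	array $availableVoices
): string {
	// Normalise to a zero-indexed list so "first voice" is well defined.
	$voices = array_values( $availableVoices[$language] ?? [] );
	if ( $voices === [] ) {
		throw new InvalidArgumentException(
			"No voices available for language '$language'."
		);
	}
	// Prefer the configured default; otherwise fall back to the first voice.
	return $defaultVoices[$language] ?? $voices[0];
}
```

With available voices `[ 'sv' => [ 'voice-a', 'voice-b' ] ]`, an empty default map yields `voice-a`, while a default map of `[ 'sv' => 'voice-b' ]` yields `voice-b`. The sketch deliberately does not check that the configured default is among the available voices; that would be a separate validation step.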

Outdated description

With T257659 (add language and voice to wikispeech-server output), Speechoid will start returning the voice used for synthesis. This is needed when the default voice is used, because the Wikispeech extension then has no knowledge of which voice that was.

The scope of this task is to:

  1. Use this part of the response to set the voice used in UtteranceStore. This response should be used even if a voice was set, in case Speechoid starts aliasing voices in the future.
  2. Drop the associated hack in gerrit:607285.
  3. Consider whether logging is suitable if the returned voice does not match the requested voice, or if the returned voice is not in the list of selectable voices.
  4. Figure out how a lookup can be done against UtteranceStore when using the default voice.

Event Timeline

I think we didn't think this through enough. We still need the default voice name when requesting the cached utterance from utterance store. If the default-voice ownership resides in Speechoid, then we also need to be able to fetch this data from Wikispeech.

		if ( !$voice ) {
			// @todo We need to lookup what the default voice for this language is.
			// 1. Check default voice for this language in WAN cache.
			// 2. If not available, request default voice for language from Speechoid.
			// 2.1 Set in cache. Rather short TTL.
			// 3. Use this voice.
			// @todo Consider what happens if default voice differs between deployed
			// instances of Speechoid, e.g. during update. Should we perhaps pass down
			// the default voice to the client on the initial request to ensure the
			// same default voice per user session?
			throw new InvalidArgumentException( 'Voice must be explicitly set.' );
		}
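The lookup steps listed in the @todo could look roughly like the sketch below. It is only an illustration of the cache-then-fetch idea: `TtlCache` is a toy in-memory stand-in for the WAN cache, `lookupDefaultVoice` and the `$fetchFromSpeechoid` callback are hypothetical names, and the clock is passed in explicitly so the behaviour is easy to follow.

```php
<?php
// Illustrative only. TtlCache is a toy stand-in for a real cache; nothing
// here is MediaWiki's or Speechoid's actual API.
final class TtlCache {
	/** @var array<string,array{0:string,1:float}> Value and expiry time per key. */
	private array $entries = [];

	public function get( string $key, float $now ): ?string {
		if ( !isset( $this->entries[$key] ) ) {
			return null;
		}
		[ $value, $expiry ] = $this->entries[$key];
		// Expired entries are treated as misses.
		return $now < $expiry ? $value : null;
	}

	public function set( string $key, string $value, float $ttl, float $now ): void {
		$this->entries[$key] = [ $value, $now + $ttl ];
	}
}

/**
 * @param callable $fetchFromSpeechoid Hypothetical callback: fn( string $language ): string
 */
function lookupDefaultVoice(
	string $language,
	TtlCache $cache,
	callable $fetchFromSpeechoid,
	float $now,
	float $ttl = 60.0
): string {
	$key = "default-voice:$language";
	// 1. Check the cache.
	$voice = $cache->get( $key, $now );
	if ( $voice === null ) {
		// 2. On a miss, ask Speechoid and (2.1) cache with a short TTL.
		$voice = $fetchFromSpeechoid( $language );
		$cache->set( $key, $voice, $ttl, $now );
	}
	// 3. Use this voice.
	return $voice;
}
```

Note that a short TTL only narrows the window in which two Speechoid instances can report different defaults; it does not remove the session-consistency concern raised in the second @todo.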

To clarify, we can't just store 'default' as voice in utterance store as that would cause a mix of voices if the default changes in Speechoid.

Wouldn't it be simpler to just add a default value in the config? It would require an extra check that the value is present and is one of the available voices, but that shouldn't be a big thing to add. It's probably enough to add it to WikispeechHooks::shouldWikispeechRun().
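A minimal sketch of such a check, assuming the configuration exposes one map of available voices per language and one map of default voices per language. The function name `findDefaultVoiceProblems` and both array shapes are assumptions for illustration; this is not the code that was merged.

```php
<?php
// Hypothetical validation helper; not the actual WikispeechHooks code.

/**
 * @param array<string,string[]> $availableVoices Available voices per language.
 * @param array<string,string> $defaultVoices Configured default voice per language.
 * @return string[] Human-readable problems; empty when the configuration is valid.
 */
function findDefaultVoiceProblems( array $availableVoices, array $defaultVoices ): array {
	$problems = [];
	foreach ( $availableVoices as $language => $voices ) {
		if ( !isset( $defaultVoices[$language] ) ) {
			// The value must be present ...
			$problems[] = "No default voice configured for language '$language'.";
		} elseif ( !in_array( $defaultVoices[$language], $voices, true ) ) {
			// ... and must be one of the available voices.
			$problems[] = "Default voice '{$defaultVoices[$language]}' is not "
				. "available for language '$language'.";
		}
	}
	return $problems;
}
```

A caller such as a "should this run" hook could treat a non-empty result as a reason to log the problems and refuse to start.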

Sebastian_Berlin-WMSE renamed this task from "Make use of "voice" in Speechoid response" to "Specify default voice per language". Jul 23 2020, 9:11 AM
Sebastian_Berlin-WMSE updated the task description.
Sebastian_Berlin-WMSE set the point value for this task to 4.
Sebastian_Berlin-WMSE moved this task from Incoming to Backlog on the Wikispeech-Jobrunner board.
kalle renamed this task from "Specify default voice per language" to "Extension is owner of default voice per language". Aug 3 2020, 12:18 PM

Change 618051 had a related patch set uploaded (by Karl Wettin (WMSE); owner: Karl Wettin (WMSE)):
[mediawiki/extensions/Wikispeech@master] Extension is owner of default voice per language

https://gerrit.wikimedia.org/r/618051

@kalle
While I believe the default voices should be owned by Speechoid (as the part of the software that is responsible for synthesis), I agree that this is the best solution to our immediate problem.

Once T257659 (add language and voice to wikispeech-server output) is implemented, I still believe "1. Use this part of the response to set the voice used in UtteranceStore." holds true, in case of aliasing etc.

Once T260051 (Add endpoint listing default voice per language) is implemented, we should include some info about how it can be used to set up the original configuration. At that point it might also be worth having a job that, every X requests, updates the local defaults based on that list, to move ownership back to Speechoid.

Change 618051 merged by jenkins-bot:
[mediawiki/extensions/Wikispeech@master] Extension is owner of default voice per language logic

https://gerrit.wikimedia.org/r/618051

kalle renamed this task from "Extension is owner of default voice per language" to "Extension is owner of default voice per language logic". Aug 13 2020, 2:29 PM
kalle moved this task from 🤠 This week to 🤯 Done on the User-kalle board.