☂ Deploy Wikispeech on beta cluster
Open, Needs Triage, Public

Assigned To

None

Authored By

	Sebastian_Berlin-WMSE
	Nov 8 2017, 11:05 AM

Description

Deploy the Wikispeech tool on the beta cluster for testing and evaluation. The tool consists of a TTS backend service (Speechoid) and a MediaWiki extension.

After testing on the beta cluster, the intention is to deploy the extension as a beta feature on the ar, en and sv Wikipedias. These are the languages currently supported.

Related Objects

Status	Assigned	Task
Open	None	T264842 Deploy Wikispeech in production
Open	None	T180015 ☂ Deploy Wikispeech on beta cluster
Open	None	T180021 Security review for extension Wikispeech
Invalid	None	T193072 TTS server deployment strategy
Resolved	Lokal_Profil	T192990 Use Swift for audio-file storage
Declined	None	T242581 Setup Swift server for development
Resolved	Lokal_Profil	T203161 Restrict access to Wikispeech functionality to certain users
Open	None	T264748 ☂ Speechoid WMF deployment
Open	None	T264749 Decide if Speechoid should be a Kubernetes envelope pod or separate services
Resolved	• kalle	T264752 Contact service ops regarding deployment of Speechoid
Open	None	T265280 Create helm chart for Speechoid
Resolved	• kalle	T250357 Contact WMF DBA for best practice regarding Pronlex database
Resolved	None	T258930 Prepare for DBA meeting
Resolved	• kalle	T255128 Run devserver Pronlex on Maria DB
Resolved	None	T253499 ☂Blubber CI pipeline
Resolved	• kalle	T257467 Remove self-tests from startup in Mary-TTS
Resolved	Jopparn	T257470 Introduce Mary-TTS Blubber test build
Resolved	• kalle	T259878 Refactor Blubber prepare to builder-stage
Resolved	• kalle	T259879 Update documentation referring to blubber-prepare.sh
Resolved	• kalle	T259880 Update docker compose project to no longer use blubber-prepare
Resolved	• kalle	T259881 Introduce .pipeline/config.yaml
Resolved	• kalle	T259911 Tell Jenkins and Zuul about the Speechoid pipielines
Resolved	None	T262908 Contact releng regarding requirements for release of blubber docker images
Resolved	• kalle	T265275 Publish Speechoid to docker repo
Open	None	T264753 [DRAFT] Security review for Speechoid service
Resolved	• kalle	T264403 ☂ Benchmark maintenance script
Resolved	• kalle	T247314 Estimate segmenting time
Resolved	• kalle	T247317 Estimate disk space usage of audio files
Resolved	• kalle	T247282 Estimate synthesis time
Resolved	• kalle	T264899 Introduce configurable response timeout in SpeechoidConnector
Resolved	Sebastian_Berlin-WMSE	T264702 Update help page with info about showing player and selection player
Open	None	T265021 Add Wikispeech to make-wmf-branch release tool
Open	None	T265023 Add Wikispeech to extension-list
Open	None	T265041 Add Wikispeech to InitialiseSettings.php and InitialiseSettings-labs.php
Open	None	T265042 Add Wikispeech to CommonSettings.php

Event Timeline

Sebastian_Berlin-WMSE created this task. Nov 8 2017, 11:05 AM

Sebastian_Berlin-WMSE added a subtask: T180021: Security review for extension Wikispeech. Nov 8 2017, 11:37 AM

Sebastian_Berlin-WMSE added a project: Wikispeech-WMSE.

It’s unclear if this service needs to be deployed to production before the extension is deployed on the beta cluster or if this is only needed before the extension gets deployed to production. Should we create a deployment subtask for the service already now?

Sebastian_Berlin-WMSE moved this task from Incoming to Monitoring on the Wikispeech board. Dec 4 2017, 10:29 AM

In T180015#3744046, @Lokal_Profil wrote:

It’s unclear if this service needs to be deployed to production before the extension is deployed on the beta cluster or if this is only needed before the extension gets deployed to production. Should we create a deployment subtask for the service already now?

Probably. You're going to have to write puppet code/modules to actually get it deployable on beta (possibly falls to Release-Engineering-Team to decide/confirm whether we can make the usage on beta depend on the tool in tool labs for definite) and this is definitely the case on production

Also, I'm not sure having something like http://wikispeech-tts.wmflabs.org/ in production, externally accessible on the web, is going to happen. You do need to speak to SRE about a deployment strategy... If it's not packaged for Debian, and the current installation instructions involve wget-ing various jars from the internet...

https://www.mediawiki.org/wiki/Extension:Wikispeech#Install_TTS_server

# In ''mishkal/tashkeel/tashkeel.py'', change line 385 from:
#* <code>vocalized_text = u" ".join([vocalized_text, self.display(word, format_display)])</code>
#* to:
#* <code>vocalized_text = u" ".join([vocalized_text, self.display(voc_word, format_display)])</code>

*definitely* needs fixing upstream. https://github.com/linuxscout/mishkal/issues/17 has been open since April, which doesn't inspire me with much confidence
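The one-word change quoted above can be illustrated in isolation. The following is a minimal sketch, not mishkal's actual code: `display` and `join_words` are hypothetical stand-ins, the sample strings are made-up Latin placeholders, and it assumes that `word` holds the raw token while `voc_word` holds its vocalized form, which the variable names suggest but the thread does not confirm.

```python
def display(word, format_display):
    """Hypothetical stand-in for self.display(); returns the word unchanged."""
    return word


def join_words(pairs, use_fix, format_display="text"):
    """Accumulate output text the way the quoted line 385 does.

    pairs: (word, voc_word) tuples. The meaning of the two names is an
    assumption made for this sketch.
    use_fix: False mimics the old line (passes word),
             True mimics the patched line (passes voc_word).
    """
    vocalized_text = u""
    for word, voc_word in pairs:
        chosen = voc_word if use_fix else word
        vocalized_text = u" ".join(
            [vocalized_text, display(chosen, format_display)]
        )
    return vocalized_text


# Placeholder data: raw tokens paired with made-up "vocalized" forms.
pairs = [(u"ktb", u"kataba"), (u"qlm", u"qalam")]
print(join_words(pairs, use_fix=False))
print(join_words(pairs, use_fix=True))
```

Under these assumptions, the old line builds its output from the raw tokens, so the vocalized forms never reach the result; the patched line builds it from the `voc_word` values instead.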

In T180015#3835387, @Reedy wrote:
# In ''mishkal/tashkeel/tashkeel.py'', change line 385 from:
#* <code>vocalized_text = u" ".join([vocalized_text, self.display(word, format_display)])</code>
#* to:
#* <code>vocalized_text = u" ".join([vocalized_text, self.display(voc_word, format_display)])</code>
*definitely* needs fixing upstream. https://github.com/linuxscout/mishkal/issues/17 has been open since April, which doesn't inspire me with much confidence

https://github.com/linuxscout/mishkal/pull/19

And looking at https://www.mediawiki.org/wiki/Extension:Wikispeech#Make_audio_files_accessible

No, we're not going to be writing a directory like this to share the files... It doesn't scale...

Depending on how it works, they should probably be going into Swift like transcodes etc do
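As a rough sketch of what addressing audio files as Swift objects, rather than as files in a shared directory, might look like: Swift's HTTP API addresses objects by a path of the form `/v1/{account}/{container}/{object}`. The function name, account, container name, file extension and content-hash naming scheme below are all assumptions for illustration; nothing in this thread specifies them.

```python
import hashlib
from urllib.parse import quote


def swift_object_path(account, container, audio_bytes, extension="opus"):
    """Build a Swift-style object path: /v1/{account}/{container}/{object}.

    Naming the object after a SHA-256 hash of its content is an assumption
    made for this sketch, as is the default file extension.
    """
    digest = hashlib.sha256(audio_bytes).hexdigest()
    object_name = "{}.{}".format(digest, extension)
    return "/v1/{}/{}/{}".format(
        quote(account), quote(container), quote(object_name)
    )


# Hypothetical account and container names.
path = swift_object_path("AUTH_example", "wikispeech-utterances", b"fake audio")
print(path)
```

Because the object name here depends only on the content, the same audio always maps to the same path, so no shared directory listing is needed to locate a file.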

And https://www.mediawiki.org/wiki/Extension:Wikispeech#Start_processes_in_screen isn't going to fly in production... Things aren't going to be manually started in a screen session

Reedy mentioned this in T180021: Security review for extension Wikispeech. Dec 13 2017, 9:30 PM

Reedy changed the status of subtask T180021: Security review for extension Wikispeech from Open to Stalled.

Hi there!

Thanks to @Reedy for doing a quick first pass review of this. It seems this is going to need a fair amount of re-architecting to get to a place where it would be available on either Beta Cluster or production.

I see that this is associated with WMSE; does WMSE have funding for doing the needed work (re-architecture etc)? Looks like there will need to be some time for someone from the TechCom to help out?

greg added a comment. Dec 13 2017, 10:08 PM

This comment was removed by greg.

Thanks for the feedback.

In T180015#3835362, @Reedy wrote:

In T180015#3744046, @Lokal_Profil wrote:

It’s unclear if this service needs to be deployed to production before the extension is deployed on the beta cluster or if this is only needed before the extension gets deployed to production. Should we create a deployment subtask for the service already now?

Probably. You're going to have to write puppet code/modules to actually get it deployable on beta (possibly falls to Release-Engineering-Team to decide/confirm whether we can make the usage on beta depend on the tool in tool labs for definite) and this is definitely the case on production

Work on Puppet code for the server has started in T151877. It would be good to know if this is a blocker for the review process, in which case we need to prioritize it.

In T180015#3835372, @Reedy wrote:

[...] and the current installation instructions involve wget-ing various jars from the internet...

In T180015#3835387, @Reedy wrote:
# In ''mishkal/tashkeel/tashkeel.py'', change line 385 from:
#* <code>vocalized_text = u" ".join([vocalized_text, self.display(word, format_display)])</code>
#* to:
#* <code>vocalized_text = u" ".join([vocalized_text, self.display(voc_word, format_display)])</code>
*definitely* needs fixing upstream. https://github.com/linuxscout/mishkal/issues/17 has been open since April, which doesn't inspire me with much confidence

These things have been fixed, but the extension page wasn't up to date. I removed the old workarounds and added a link to the installation instructions.

In T180015#3835408, @Reedy wrote:

And https://www.mediawiki.org/wiki/Extension:Wikispeech#Start_processes_in_screen isn't going to fly in production... Things aren't going to be manually started in a screen session

This was intended for running the server in a development environment. Added a comment about this in the documentation.

In T180015#3835942, @greg wrote:

Hi there!

Thanks to @Reedy for doing a quick first pass review of this. It seems this is going to need a fair amount of re-architecting to get to a place where it would be available on either Beta Cluster or production.

I see that this is associated with WMSE; does WMSE have funding for doing the needed work (re-architecture etc)? Looks like there will need to be some time for someone from the TechCom to help out?

Hi @greg! Yes, we will continue to work on the extension next year as well, so your feedback is much appreciated. @brion from TechCom has kindly offered us help as well.

Aklapper mentioned this in T190076: Investigate potential process improvements how to get (third party maintained) software deployed on Wikimedia sites. Mar 19 2018, 5:19 PM

Lokal_Profil mentioned this in T192990: Use Swift for audio-file storage. Apr 25 2018, 9:28 AM

akosiaris mentioned this in T193072: TTS server deployment strategy. Apr 26 2018, 7:58 AM

Aklapper mentioned this in T194014: Leave Wikispeech ready for deployment. May 7 2018, 2:51 PM

Theklan subscribed. May 19 2018, 8:23 AM

Reedy added a subtask: T192990: Use Swift for audio-file storage. May 19 2018, 3:59 PM

Just a general update from the WMSE team. We have only been doing minor work on Wikispeech so far this year, as we've had no funding for the project. We have now secured some funding for a continuation project, so we should be able to work on this again after summer.

Note that we are only a small team working on this (mainly me and @Sebastian_Berlin-WMSE) and that we are both also involved in other projects at WMSE which demand our time. As a result, the pace at which things get done is not always what we would wish for, and it is sensitive to deadlines in the other projects. That said, Wikispeech is something WMSE is committed to continuing, so development does continue despite the at times slow pace.

bd808 subscribed. Jul 11 2018, 12:39 AM

Lokal_Profil mentioned this in T203161: Restrict access to Wikispeech functionality to certain users. Aug 30 2018, 2:16 PM

Sebastian_Berlin-WMSE edited projects, added Wikispeech-Text-to-Speech; removed Wikispeech. Nov 11 2019, 12:19 PM

Sebastian_Berlin-WMSE moved this task from Unsorted to Monitoring on the Wikispeech-Text-to-Speech board.

Sebastian_Berlin-WMSE mentioned this in T235844: Collect tasks related code and security review. Nov 15 2019, 4:22 PM

Sebastian_Berlin-WMSE added a subtask: T203161: Restrict access to Wikispeech functionality to certain users. Dec 3 2019, 11:33 AM

Addshore subscribed. Jan 20 2020, 3:52 PM

Restricted Application added a project: Wikispeech-Jobrunner. Jan 20 2020, 3:52 PM

Sebastian_Berlin-WMSE moved this task from Incoming to Backlog on the Wikispeech-Jobrunner board. Jan 30 2020, 11:38 AM

Sebastian_Berlin-WMSE closed subtask T203161: Restrict access to Wikispeech functionality to certain users as Resolved. Apr 16 2020, 9:50 AM

Lokal_Profil closed subtask T192990: Use Swift for audio-file storage as Resolved. Sep 23 2020, 9:45 AM

• kalle added a subtask: T264748: ☂ Speechoid WMF deployment. Oct 6 2020, 12:50 PM

• kalle added a subtask: T250357: Contact WMF DBA for best practice regarding Pronlex database. Oct 6 2020, 1:22 PM

• kalle added a subtask: T253499: ☂Blubber CI pipeline. Oct 6 2020, 1:28 PM

• kalle removed a subtask: T250357: Contact WMF DBA for best practice regarding Pronlex database. Oct 6 2020, 1:34 PM

• kalle removed a subtask: T253499: ☂Blubber CI pipeline.

Sebastian_Berlin-WMSE removed a subtask: T264753: [DRAFT] Security review for Speechoid service. Oct 6 2020, 1:36 PM

Sebastian_Berlin-WMSE renamed this task from Deploy extension Wikispeech on beta cluster to ☂ Deploy Wikispeech on beta cluster. Oct 6 2020, 1:45 PM

Sebastian_Berlin-WMSE updated the task description.

Sebastian_Berlin-WMSE mentioned this in T264706: Set up task structure for Wikispeech deployment. Oct 6 2020, 2:01 PM

Hello again. Some years ago we worked on implementing the Basque TTS in this extension. Is it included in the current version for the beta cluster? Thanks!

Sebastian_Berlin-WMSE added a subtask: T264702: Update help page with info about showing player and selection player. Oct 7 2020, 6:53 AM

Lokal_Profil changed the status of subtask T180021: Security review for extension Wikispeech from Stalled to Open. Oct 7 2020, 7:28 AM

In T180015#6521667, @Theklan wrote:

Hello again. Some years ago we worked implementing the Basque TTS into this extension. Is it included in the current version for the beta cluster? Thanks!

Hi. Basque is unfortunately not included in what we are hoping to deploy to the beta cluster right now. Please see https://www.mediawiki.org/wiki/Extension_talk:Wikispeech#Basque_language for more details.

Sebastian_Berlin-WMSE added a parent task: T264842: Deploy Wikispeech in production. Oct 7 2020, 8:26 AM

Lokal_Profil mentioned this in T261296: Identify remaining blockers for beta. Oct 7 2020, 9:09 AM

Thanks! @Lokal_Profil

MarcoAurelio closed subtask T265021: Add Wikispeech to make-wmf-branch release tool as Resolved. Oct 8 2020, 4:33 PM

MarcoAurelio reopened subtask T265021: Add Wikispeech to make-wmf-branch release tool as Open. Oct 8 2020, 5:52 PM

Lokal_Profil mentioned this in T264842: Deploy Wikispeech in production. Nov 9 2020, 12:46 PM

Sebastian_Berlin-WMSE closed subtask T264702: Update help page with info about showing player and selection player as Resolved. Jan 14 2021, 12:11 PM

Addshore unsubscribed. Jun 27 2023, 12:39 PM