Deploy extension Wikispeech on beta cluster
Open, Needs Triage, Public

Description

The extension relies on an external service, currently running at http://wikispeech-tts.wmflabs.org/.

After testing on the beta cluster the intention is to have the extension deployed as a beta feature on the ar, en and sv Wikipedias.

Event Timeline

It’s unclear if this service needs to be deployed to production before the extension is deployed on the beta cluster or if this is only needed before the extension gets deployed to production. Should we create a deployment subtask for the service already now?

Reedy added a subscriber: Reedy. Edited Dec 13 2017, 6:42 PM

It’s unclear if this service needs to be deployed to production before the extension is deployed on the beta cluster or if this is only needed before the extension gets deployed to production. Should we create a deployment subtask for the service already now?

Probably. You're going to have to write puppet code/modules to actually get it deployable on beta (possibly falls to Release-Engineering-Team to decide/confirm whether we can make the usage on beta depend on the tool in tool labs for definite) and this is definitely the case on production

Reedy added a comment. Dec 13 2017, 6:45 PM

Also, I'm not sure having something like http://wikispeech-tts.wmflabs.org/ in production, externally accessible over the web, is going to happen... You do need to speak to Operations about a deployment strategy... If it's not packaged for Debian, and the current installation instructions are wget-ing various jars from the internet...

https://www.mediawiki.org/wiki/Extension:Wikispeech#Install_TTS_server

Reedy added a comment. Dec 13 2017, 6:47 PM
# In ''mishkal/tashkeel/tashkeel.py'', change line 385 from:
#* <code>vocalized_text = u" ".join([vocalized_text, self.display(word, format_display)])</code>
#* to:
#* <code>vocalized_text = u" ".join([vocalized_text, self.display(voc_word, format_display)])</code>

*definitely* needs fixing upstream. https://github.com/linuxscout/mishkal/issues/17 has been open since April, which doesn't inspire me with much confidence
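For readers who have not opened the file, the quoted workaround amounts to passing the vocalized word rather than the original word to the display call. A minimal sketch of that difference follows. It is not the actual mishkal code: the `display` and `join_words` functions, the `use_fix` flag and the sample words are hypothetical stand-ins, and only `word`, `voc_word`, `vocalized_text` and `format_display` come from the quoted lines.

```python
# Hypothetical, simplified stand-in for the code around line 385 of
# mishkal/tashkeel/tashkeel.py. Only word, voc_word, vocalized_text and
# format_display are taken from the quoted workaround; everything else
# is an assumption made for illustration.

def display(word, format_display):
    """Toy formatter: wrap the word for HTML output, else return it as-is."""
    if format_display == "html":
        return u"<span>%s</span>" % word
    return word


def join_words(pairs, format_display="text", use_fix=True):
    """Join (word, voc_word) pairs into one output string.

    With use_fix=False the original, unvocalized word is emitted, which
    mirrors the line that the workaround replaces.
    """
    vocalized_text = u""
    for word, voc_word in pairs:
        chosen = voc_word if use_fix else word
        vocalized_text = u" ".join([vocalized_text, display(chosen, format_display)])
    # The first join adds a leading space, so trim it before returning.
    return vocalized_text.strip()
```

Called with pairs of (original, vocalized) words, the `use_fix=False` path returns the input text unchanged, which would be the visible symptom if the unfixed line behaves the way this sketch assumes.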

Reedy added a comment. Dec 13 2017, 6:50 PM
# In ''mishkal/tashkeel/tashkeel.py'', change line 385 from:
#* <code>vocalized_text = u" ".join([vocalized_text, self.display(word, format_display)])</code>
#* to:
#* <code>vocalized_text = u" ".join([vocalized_text, self.display(voc_word, format_display)])</code>

*definitely* needs fixing upstream. https://github.com/linuxscout/mishkal/issues/17 has been open since April, which doesn't inspire me with much confidence

https://github.com/linuxscout/mishkal/pull/19

Reedy added a comment. Dec 13 2017, 6:54 PM

And looking at https://www.mediawiki.org/wiki/Extension:Wikispeech#Make_audio_files_accessible

No, we're not going to be writing a directory like this to share the files... It doesn't scale...

Depending on how it works, they should probably be going into Swift like transcodes etc do
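One way to picture the suggestion above is to have the extension write synthesized audio through a small object-store interface instead of a shared directory. The sketch below is only an illustration under stated assumptions: `ObjectStore`, `InMemoryStore`, `audio_key`, `get_or_synthesize`, the key layout and the `.opus` suffix are all hypothetical, and the in-memory class merely stands in for a real Swift-backed implementation. It does not describe how Wikispeech or the transcode pipeline actually stores files.

```python
import hashlib
from typing import Callable, Dict, Optional, Protocol


class ObjectStore(Protocol):
    """Hypothetical minimal interface a Swift-backed store could satisfy."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> Optional[bytes]: ...


class InMemoryStore:
    """Stand-in backend for local testing; not a real Swift client."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> Optional[bytes]:
        return self._objects.get(key)


def audio_key(language: str, text: str) -> str:
    """Derive a stable object name from the language and the input text."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return "wikispeech/%s/%s.opus" % (language, digest)


def get_or_synthesize(
    store: ObjectStore,
    language: str,
    text: str,
    synthesize: Callable[[str, str], bytes],
) -> bytes:
    """Return stored audio if present, otherwise synthesize and store it."""
    key = audio_key(language, text)
    cached = store.get(key)
    if cached is not None:
        return cached
    audio = synthesize(language, text)
    store.put(key, audio)
    return audio
```

Because the key depends only on the language and the text, repeated requests for the same utterance resolve to the same object, so no host-local directory needs to be shared between servers.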

And https://www.mediawiki.org/wiki/Extension:Wikispeech#Start_processes_in_screen isn't going to fly in production... Things aren't going to be manually started in a screen session

greg added a subscriber: greg. Dec 13 2017, 10:07 PM

Hi there!

Thanks to @Reedy for doing a quick first pass review of this. It seems this is going to need a fair amount of re-architecting to get to a place where it would be available on either Beta Cluster or production.

I see that this is associated with WMSE; does WMSE have funding for doing the needed work (re-architecture etc)? Looks like there will need to be some time for someone from the TechCom to help out?

greg added a comment. Dec 13 2017, 10:08 PM
This comment was removed by greg.

Thanks for the feedback.

It’s unclear if this service needs to be deployed to production before the extension is deployed on the beta cluster or if this is only needed before the extension gets deployed to production. Should we create a deployment subtask for the service already now?

Probably. You're going to have to write puppet code/modules to actually get it deployable on beta (possibly falls to Release-Engineering-Team to decide/confirm whether we can make the usage on beta depend on the tool in tool labs for definite) and this is definitely the case on production

Work on a Puppet module for the server has started in T151877. It would be good to know whether this is a blocker for the review process, in which case we need to prioritize it.

[...] and the current installation instructions are wget-ing various jars from the internet...

# In ''mishkal/tashkeel/tashkeel.py'', change line 385 from:
#* <code>vocalized_text = u" ".join([vocalized_text, self.display(word, format_display)])</code>
#* to:
#* <code>vocalized_text = u" ".join([vocalized_text, self.display(voc_word, format_display)])</code>

*definitely* needs fixing upstream. https://github.com/linuxscout/mishkal/issues/17 has been open since April, which doesn't inspire me with much confidence

These things have been fixed, but the extension page wasn't up to date. I removed the old workarounds and added a link to the installation instructions.

And https://www.mediawiki.org/wiki/Extension:Wikispeech#Start_processes_in_screen isn't going to fly in production... Things aren't going to be manually started in a screen session

This was intended for running the server in a development environment. I added a comment about this to the documentation.

Hi there!
Thanks to @Reedy for doing a quick first pass review of this. It seems this is going to need a fair amount of re-architecting to get to a place where it would be available on either Beta Cluster or production.
I see that this is associated with WMSE; does WMSE have funding for doing the needed work (re-architecture etc)? Looks like there will need to be some time for someone from the TechCom to help out?

Hi @greg! Yes, we will continue to work on the extension next year as well, so your feedback is much appreciated. @brion from TechCom has kindly offered us help as well.

Just a general update from the WMSE team. We have only been doing minor work on Wikispeech so far this year, as we've had no funding for the project. We have now secured some funding for a continuation project, so we should be able to work on this again after summer.

Note that we are only a small team working on this (mainly me and @Sebastian_Berlin-WMSE) and that we are both also involved in other projects at WMSE which demand our time. As a result, the pace at which things get done is not always what we would wish for, and it is sensitive to deadlines in the other projects. That said, Wikispeech is something WMSE is dedicated to continuing, so despite the at times slow pace, development does continue.