Create an instance on wmflabs in the wikispeech project and install the TTS server manually. This will allow us to figure out the interaction between the TTS server and the extension without having to create a Puppet role. It will also enable running the demo (T151786) entirely on wmflabs.
[w] Create instance
[x] Install TTS server
[u] Enable communication between TTS server and demo wiki, but disallow external access to the server {icon asterisk}
[ ] Document
{icon asterisk} It turned out that it was not possible to limit access to the demo wiki only, since the requests to the TTS server are sent from the client through JavaScript.
---
=Documentation=
==Creating an instance==
Follow instructions on https://wikitech.wikimedia.org/wiki/Help:Instances#Creating_an_instance:
# Log in to https://horizon.wikimedia.org and go to the //wikispeech// project (in top bar).
# Go to {nav Compute > Overview} and check that there are instances available.
# Create a Security Group:
** {nav Compute > Access & Security > Create Security Group}, **Name:** TTS-provider
*** {nav Manage Rules > Add Rule}, **Port:** 10000, **Remote:** CIDR, **CIDR:** 0.0.0.0/0
# Open //Launch Instance// dialogue ({nav Compute > Instances > Launch Instance}):
## Details: only set **Instance Name**: wikispeech-tts
## Source: **Select boot source**: Image, under **Available** add //ubuntu-14.04-trusty//
## Flavor: Under **Available** add //m1.medium// (marytts uses ~ 1.1 GB on local machine)
## Security Groups: Under **Available** add //default//, //TTS-provider//, //web-server//
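
The CIDR `0.0.0.0/0` in the //TTS-provider// rule matches every IPv4 address, so port 10000 is open to any client. A minimal Python sketch of that check, using the standard `ipaddress` module; the narrower network `10.0.0.0/8` and the sample addresses are hypothetical and only there for comparison:

```python
import ipaddress


def is_allowed(client_ip, cidr):
    """Return True if client_ip falls inside the network described by cidr."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(cidr)


# The rule from the security group above: matches any IPv4 address.
print(is_allowed("203.0.113.7", "0.0.0.0/0"))
# A hypothetical narrower rule, for comparison only.
print(is_allowed("203.0.113.7", "10.0.0.0/8"))
print(is_allowed("10.1.2.3", "10.0.0.0/8"))
```
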
To ssh to the new instance, see: https://wikitech.wikimedia.org/wiki/Help:Getting_Started#Project_Instances
==Install TTS server==
The TTS server consists of three components: MaryTTS (TTS platform), pronlex (a pronunciation lexicon database) and wikispeech_mockup (wikispeech API).
===Install MaryTTS===
# Log into wikispeech-tts
# Install java: `$ sudo apt install openjdk-7-jdk`
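# Create a user to run the server: `$ sudo useradd -m tts-agent`, then set its password: `$ sudo passwd tts-agent` (`useradd -p` expects an already encrypted password, not plain text).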
# Become this user: `$ sudo su - tts-agent`
# Clone [[ https://github.com/marytts/marytts-installer | marytts-installer ]] repo: `$ git clone https://github.com/marytts/marytts-installer.git`
# Follow instructions to install English voices
# Download needed STTS-voices from https://github.com/HaraldBerthelsen/marytts/tree/master/stts_voices into `/installed/` (e.g. `voice-stts_sv_nst-hsmm-5.2-SNAPSHOT.jar`)
# Install marytts-lang-sv:
## Clone the [[ https://github.com/HaraldBerthelsen/marytts | forked maryTTS ]] repo into //stts_marytts//: `$ git clone https://github.com/HaraldBerthelsen/marytts.git stts_marytts`
## Build: `$ cd stts_marytts; ./gradlew build`
## Copy Swedish language into //marytts-installer//: `$ cp build/install/marytts/lib/marytts-lang-sv-6.0-SNAPSHOT.jar ../marytts-installer/installed/`
## Delete //stts_marytts//: `$ cd ..; rm -R stts_marytts`
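
After these steps, the Swedish voice and language jars should both be in the installer's //installed// directory. A small Python sketch for checking that, not part of MaryTTS itself; the jar names are taken from the steps above, while the helper `missing_jars` and the assumed clone location in the home directory are hypothetical:

```python
from pathlib import Path

# File names taken from the steps above; the directory layout is an assumption.
EXPECTED_JARS = [
    "voice-stts_sv_nst-hsmm-5.2-SNAPSHOT.jar",
    "marytts-lang-sv-6.0-SNAPSHOT.jar",
]


def missing_jars(installed_dir, expected=EXPECTED_JARS):
    """Return the expected jar names that are not present in installed_dir."""
    directory = Path(installed_dir)
    return [name for name in expected if not (directory / name).is_file()]


if __name__ == "__main__":
    # Assumes marytts-installer was cloned into the current user's home directory.
    print(missing_jars(Path.home() / "marytts-installer" / "installed"))
```

An empty list means both jars were found.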
===Install pronlex===
====Prerequisites====
This needs to be done with super user access, i.e. not as //tts-agent//.
* Install gcc: `$ sudo apt-get install gcc`
* Install build-essential: `$ sudo apt-get install build-essential`
Following the instructions at https://github.com/stts-se/lexdata/wiki/Create-lexicon-database:
# Install go:
## Get the compatible version ([[ https://storage.googleapis.com/golang/go1.7.4.linux-amd64.tar.gz | 1.7.4 ]])
## Install: `$ sudo tar -C /usr/local -xzf go1.7.4.linux-amd64.tar.gz`
## Add `export PATH=$PATH:/usr/local/go/bin` to `.bashrc`
# Install Sqlite3: `$ sudo apt install sqlite3`
# "//Clone the source code//": Follow instructions but use `https://github.com/stts-se/pronlex.git`
# "//Clone the lexdata repository//": Follow instructions but use `https://github.com/stts-se/lexdata.git`
# "//Prepare symbol set files//": Before the first step: `$ cd ~/go/src/github.com/stts-se/pronlex`
# "//Create an empty database//": Before the first step:
## Install go-sqlite3: `$ go get github.com/mattn/go-sqlite3`
## Install regexp2: `$ go get github.com/dlclark/regexp2`
# Start server: per instructions
# "//Import lexicon files//": The instructions require a GUI, so instead run:
## `$ cd ~/go/src/github.com/stts-se/pronlex; go run lexio/import/import.go lexserver/pronlex.db sv-se.nst ~/gitrepos/lexdata/sv-se/nst/swe030224NST.pron-ws.utf8.gz sv-se_ws-sampa`
# Close
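
Most failures in the pronlex steps come from a tool that is missing from `PATH` (for example when `.bashrc` has not been re-read after adding Go). A minimal Python sketch for checking this up front; the tool names are the ones installed above, and the helper `missing_tools` is hypothetical:

```python
import shutil


def missing_tools(tools):
    """Return the tool names that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Tools installed in the prerequisite and pronlex steps above.
    print(missing_tools(["gcc", "go", "sqlite3"]))
```
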
===Install wikispeech_mockup===
Following the instructions at https://github.com/stts-se/wikispeech_mockup:
As user with super user access:
# Install the //Prerequisites// but skip the remaining steps
# Install opusenc: `$ sudo apt install opus-tools`.
As //tts-agent//:
# In home directory, clone https://github.com/stts-se/wikispeech_mockup.git: `$ git clone https://github.com/stts-se/wikispeech_mockup.git`
# Create tmp directory in //wikispeech_mockup//: `$ mkdir wikispeech_mockup/tmp`
# Add `host=0.0.0.0` to `app.run()` in //wikispeech.py// (see: http://stackoverflow.com/questions/30554702/cant-connect-to-flask-web-service-connection-refused).
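
The reason for `host=0.0.0.0` is that Flask's development server listens on `127.0.0.1` by default, which is only reachable from the instance itself. The sketch below illustrates the difference between the two bind addresses with a plain socket rather than Flask; the helper `bound_address` is hypothetical and not part of wikispeech_mockup:

```python
import socket


def bound_address(host):
    """Bind a TCP socket to host on an OS-chosen free port and return the bound IP."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS pick any free port
        sock.listen(1)
        return sock.getsockname()[0]


# Loopback only: connections from other machines are not accepted.
print(bound_address("127.0.0.1"))
# All IPv4 interfaces: reachable from outside, subject to the security group rules.
print(bound_address("0.0.0.0"))
```
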
==Make audio files accessible==
# Install apache: `$ sudo apt install apache2`
# Link the audio file directory in /var/www/html: `$ cd /var/www/html; sudo ln -s ~/wikispeech_mockup/tmp/ audio`
# Add proxy: In horizon, {nav DNS > Web Proxies > Create Proxy}: **Hostname:** wikispeech-tts-audio, **Backend Instance:** wikispeech-tts, **Backend port:** 80
Audio files generated by the TTS should now be accessible through: http://wikispeech-tts-audio.wmflabs.org/audio/.
* Disallow access to non-opus files by adding a //.htaccess//-file in //~/wikispeech_mockup/tmp/// with the following content:
```
<FilesMatch "\.*$">
Deny from all
</FilesMatch>
<FilesMatch "\.opus$">
Order deny,allow
Allow from all
</FilesMatch>
```
* Remove directory listing by adding a //index.html// file in //~/wikispeech_mockup/tmp/// with something like the following content:
`This is a service to serve Wikispeech audio files. There is really never any reason to see this page.`
* To also display this page in the root directory: `$ cd /var/www/html/; sudo rm index.html; sudo ln -s audio/index.html`
Note that adding `Options -Indexes` to the top of //.htaccess// did not prevent the directory listing.
==Update the server with the new path==
# In //~/wikispeech_mockup/wikispeech.py//, change:
** In `synthesise()`
```
- audio_url = "%s/wikispeech_mockup/%s" % (hostname,opus_audio)
+ audio_url = "//wikispeech-tts-audio.wmflabs.org/audio/%s" % (opus_audio)
```
** In `saveAndConvertAudio()`
```
- opus_url_suffix = re.sub("^.*/%s/" % tmpdir, "%s/" % tmpdir, tmpopus)
+ opus_url_suffix = re.sub("^.*/%s/" % tmpdir, "", tmpopus)
```
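
The effect of the two changes can be tried in isolation. In this sketch the values of `tmpdir` and `tmpopus` are made up for illustration; in //wikispeech.py// they are set by the surrounding code:

```python
import re

# Hypothetical example values; the real ones come from wikispeech.py.
tmpdir = "tmp"
tmpopus = "/home/tts-agent/wikispeech_mockup/tmp/abc123.opus"

# Before the change: the suffix keeps the "tmp/" directory prefix.
old_suffix = re.sub("^.*/%s/" % tmpdir, "%s/" % tmpdir, tmpopus)

# After the change: everything up to and including "tmp/" is stripped,
# leaving only the file name.
new_suffix = re.sub("^.*/%s/" % tmpdir, "", tmpopus)

# The file name is appended to the proxy URL that serves the audio directory.
audio_url = "//wikispeech-tts-audio.wmflabs.org/audio/%s" % (new_suffix)

print(old_suffix)
print(new_suffix)
print(audio_url)
```

The directory prefix is dropped because the //audio// symlink already points at the //tmp// directory, so the URL only needs the file name.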