- Update the Docker images for all Speechoid components. Some may not have changed since the last version, so they won't need it.
  - pronlex - tests run.
  - symbolset - tests run.
  - wikispeech-server
- Add new images
  - Matcha
  - Piper
  - Textprocessor
- Update speechoid-docker-compose/ in the Wikispeech repo.
- Retire wikispeech-speechoid-docker-compose on GitHub in favour of 2.
Description
Details
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Stalled | | None | T351223 Build new Speechoid release |
| Open | | None | T360881 Deploy Speechoid 0.1.3 on producer wiki |
Event Timeline
It looks like there are other Docker images that work better and build successfully. Openjdk-11-jdk is probably the one we want to use.
I've managed to build all the images in the image creation script, and all of them run except symbolset, which quits after a bit. Not sure why, the log looks fine:
$ docker compose logs symbolset
speechoid-symbolset-1 | Starting Symbolset...
speechoid-symbolset-1 | 2023/12/07 11:46:17 server: created logger for stderr
speechoid-symbolset-1 | 2023/12/07 11:46:17 server: loading mapper definitions from file ss_files/mappers.txt
speechoid-symbolset-1 | 2023/12/07 11:46:17 server: initializing mapper {sv-se_ws-sampa-DEMO sv-se_sampa_mary-DEMO [{" au . l a ' a*U - l a}]}
speechoid-symbolset-1 | 2023/12/07 11:46:17 server: initializing mapper {sv-se_sampa_mary-DEMO sv-se_ws-sampa-DEMO [{' a*U - l a " au . l a}]}
speechoid-symbolset-1 | 2023/12/07 11:46:17 server: initializing mapper {sv-se_ws-sampa sv-se_sampa_mary [{" au . l a ' a*U - l a}]}
speechoid-symbolset-1 | 2023/12/07 11:46:17 server: initializing mapper {sv-se_sampa_mary sv-se_ws-sampa [{' a*U - l a " au . l a}]}
speechoid-symbolset-1 | 2023/12/07 11:46:17 server: loaded symbol sets from dir ss_files
speechoid-symbolset-1 | 2023/12/07 11:46:17 server: loaded converter enusampa_svsampa-DEMO
speechoid-symbolset-1 | 2023/12/07 11:46:17 server: loaded converter enusampa_svsampa
speechoid-symbolset-1 | 2023/12/07 11:46:17 server: loaded converters from dir ss_files
speechoid-symbolset-1 | 2023/12/07 11:46:17 Server started on 0.0.0.0:8771
speechoid-symbolset-1 | 2023/12/07 11:46:17 server: server up and running using port 8771
speechoid-symbolset-1 | Test successful!
For some reason the test variant is run for symbolset, which is why it exits when it's done (I think). Not sure why this happens for just this image; the others seem to get the correct production variant.
After having poked around a bit and recreated the image for symbolset, it now works. I'm not sure what actually changed 😕
It may not be part of this task, but there are a few components that aren't required for Speechoid to run, such as AhoTTS and Sox proxy. It looks like these could be made optional using Docker Compose profiles.
wikispeech-server was updated in December and the new version is in the Docker registry. The rest of the components may have had changes since the last Speechoid version, though I doubt all of them have. I'll have a look and update the Docker images for any that need it; this should just require merging upstream.
AhoTTS didn't build after merging the latest changes. After changing the Python version (from 2 to 3) and adding Pip, it builds. I still need to test that it actually works as it should.
Change 1003441 had a related patch set uploaded (by Sebastian Berlin (WMSE); author: Sebastian Berlin (WMSE)):
[mediawiki/services/wikispeech/ahotts@master] Merge remote-tracking branch 'gerrit/upstream' into 2024-02-12-merge-github
The tests for AhoTTS don't run with Python 3, but do with Python 2. We really shouldn't be using Python 2 since it's been retired for about four years, so I'll have to do some more digging.
I have done some detective work now and found the following:
- Pip needs to be added to installed packages.
- The test in blubber-entrypoint-test.sh checks if a GET request to the server works without any parameters or data. In actual use, it looks like there are two requests to synthesise an utterance: POST to /ahotts_getaudio to create the audio file and then GET to /ahotts_downloadfile to retrieve it.
- In ahotts-httpserver.py AhoTTS_HTTPServer.do_POST the data is parsed as a query string. Somehow this worked with Python 2's urlparse, but not with 3's. There is also some transformation of the parameters for Python 3 that breaks them.
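One plausible cause of the query string problem in the last point is the bytes/str split in Python 3. This is an assumption and has not been checked against ahotts-httpserver.py. In Python 2 the request body was a plain str, so parsing it gave str keys. In Python 3 the body is read as bytes, and urllib.parse.parse_qs then returns bytes keys and values, so lookups with str keys miss. A minimal sketch of decoding before parsing, where the parameter names text and lang and the helper parse_post_body are hypothetical:

```python
from urllib.parse import parse_qs


def parse_post_body(raw_body: bytes, encoding: str = "utf-8") -> dict:
    """Decode the raw POST body first so that keys and values are str."""
    text = raw_body.decode(encoding)
    # parse_qs maps each key to a list of values; keep the first value only.
    return {key: values[0] for key, values in parse_qs(text).items()}


# Hypothetical body; the real parameter names are not given in this task.
raw = b"text=hello+world&lang=eu"

# Parsing the bytes directly gives bytes keys, so a str lookup misses.
parsed_as_bytes = parse_qs(raw)
print("text" in parsed_as_bytes)   # False
print(b"text" in parsed_as_bytes)  # True

# Decoding first gives str keys and values.
params = parse_post_body(raw)
print(params["text"])  # hello world
```

If the server already decodes the body somewhere else, a second transformation of the same parameters could be what breaks them, but that would need to be confirmed in the code.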
After hacking around these issues I managed to get the Blubber test to run, which is required by the pipeline (see failures in patch in T351223#9542662). I'm not confident that there won't be any more trouble when actually running it, though.
Apart from switching the base Docker image to openjdk-11-jdk, MaryTTS looks like it's running fine.
From the readme in the upstream repo for Mishkal it looks like Python 3 should work, but it doesn't for me.
Pronlex tests run after upstream merge. I also updated the base Docker image to docker-registry.wikimedia.org/golang1.21. This means I could also clean out some things that downloaded and set up Go in the Blubber scripts.
I managed to get everything working after keeping the old Docker images for AhoTTS and Mishkal. I don't plan to update them since I couldn't get them to run properly, so they won't get the generic Blubber updates like the other images.
I also noticed/remembered that wikispeech-server-sox-proxy isn't included in the Docker Compose file. Rather, it is, but it's commented out. There's no Docker image for it in the registry, so that needs to be added. Currently it's hacked into the producer server as a local image.
I saw that it's recommended to use GitLab (as opposed to Gerrit) for getting Docker images into the registry. I'm going to give that a try for wikispeech-server-sox-proxy.
Change 1003441 abandoned by Sebastian Berlin (WMSE):
[mediawiki/services/wikispeech/ahotts@master] Merge remote-tracking branch 'gerrit/upstream' into 2024-02-12-merge-github
Reason:
I couldn't get this version running
Docker images for mary-tts, pronlex, symbolset and wikispeech-server have been updated in the registry.
We now have a group in GitLab for Speechoid. I imported sox-proxy and am waiting for it to get added as a trusted project so it can be published via CI.
First image for Sox proxy is now up on the registry. A readme should be added describing how to make new versions of it, similar to README-merge-from-github-upstream.md in other repos.
Change #1014008 had a related patch set uploaded (by Sebastian Berlin (WMSE); author: Sebastian Berlin (WMSE)):
[mediawiki/extensions/Wikispeech@master] New version of Speechoid, 0.1.3
Two follow-up tasks:
- Documentation described in T351223#9637152
- mediawiki/services/wikispeech is called (repos/)mediawiki/services/speechoid on GitLab, which needs to be considered in the migration - T360758
Change #1014008 merged by Sebastian Berlin (WMSE):
[mediawiki/extensions/Wikispeech@master] New version of Speechoid, 0.1.3
I think this one supersedes that task. We can probably close it as duplicate, but maybe wait until T386191: Go through and clean out old #Wikispeech-Text-to-Speech tasks is done.
We ran into some issues in the publishing stage. Both Piper and Matcha use the same base image (docker-registry.wikimedia.org/python3-trixie:0.0.1-20260315) and both fail. Here's an example of Matcha failing: https://gitlab.wikimedia.org/repos/mediawiki/services/speechoid/matcha/-/jobs/769749. The first relevant line looks to be:
#11 38.04 Cannot initiate the connection to mirrors.wikimedia.org:80 (2620:0:861:2:208:80:154:139). - connect (101: Network is unreachable)
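The address in that line is an IPv6 address, and error 101 is ENETUNREACH on Linux. This may mean the build container has no IPv6 route, though that is a guess and not something confirmed in the job log. As a small sketch for checking similar lines, the following pulls the address out of the message and reports its IP version. The helper name and regular expression are made up for illustration:

```python
import ipaddress
import re

# Matches text in parentheses made up only of hex digits, colons and dots.
ADDR_IN_PARENS = re.compile(r"\(([0-9A-Fa-f:.]+)\)")


def ip_version_in_log_line(line: str):
    """Return 4 or 6 for the first IP address in parentheses, else None."""
    for candidate in ADDR_IN_PARENS.findall(line):
        try:
            return ipaddress.ip_address(candidate).version
        except ValueError:
            continue  # parenthesised text that is not an IP address
    return None


line = (
    "#11 38.04 Cannot initiate the connection to mirrors.wikimedia.org:80 "
    "(2620:0:861:2:208:80:154:139). - connect (101: Network is unreachable)"
)
print(ip_version_in_log_line(line))  # 6
```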
I tried changing the base image to docker-registry.wikimedia.org/python3-bookworm:0.0.3-20260315. Now the image builds, but is apparently too big for the registry (full log):
#16 ERROR: failed to push docker-registry.discovery.wmnet/repos/mediawiki/services/speechoid/matcha:test-b7e4a32-2: unexpected status from PUT request to https://docker-registry.discovery.wmnet/v2/repos/mediawiki/services/speechoid/matcha/blobs/uploads/4a4a2870-317e-488a-b690-0b1ad0df5e6f?_state=dRTv1hGEDjdDxLG137Ror4yBh7Zypm_aMrLUmoM5p0p7Ik5hbWUiOiJyZXBvcy9tZWRpYXdpa2kvc2VydmljZXMvc3BlZWNob2lkL21hdGNoYSIsIlVVSUQiOiI0YTRhMjg3MC0zMTdlLTQ4OGEtYjY5MC0wYjFhZDBkZjVlNmYiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjYtMDMtMThUMTQ6MzA6NTUuODY2NzYzOTY4WiJ9&digest=sha256%3A775aacc6921fcdd960d944186eb407fb8f786bc09d55d72c102a569b709d8c05: 413 Request Entity Too Large: <html> <head><title>413 Request Entity Too Large</title></head> <body> <center><h1>413 Request Entity Too Large</h1></center> <hr><center>nginx/1.22.1</center> </body> </html>
I tried disabling Pip's cache (source), but that didn't change the image size much: it only saved about 100 MB out of 9.27 GB.
A (temporary) alternative could be to stick the Blubber files directly into the Docker Compose file. This way you need to build it yourself instead of just downloading. At least for testing it should work though. I did a quick test that didn't quite work, but I've probably just missed something.
The new release will be done after testing is complete, see T421713: Build new Speechoid MVP.
Change #1266822 had a related patch set uploaded (by Viktoria Hillerud WMSE; author: Viktoria Hillerud WMSE):
[mediawiki/extensions/Wikispeech@master] Build new Speechoid release
A new version of the Matcha repo reduced the size to 2.75 GB. The build then went through and the image can be found at docker-registry.wikimedia.org/repos/mediawiki/services/speechoid/matcha:test-873338b.
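To put the two attempts at shrinking the image side by side, here is a quick calculation using only the sizes mentioned in this task. It assumes 1 GB = 1000 MB, and the helper name is made up for illustration:

```python
def percent_saved(before_mb: int, after_mb: int) -> float:
    """Return the size reduction as a percentage of the original size."""
    return (before_mb - after_mb) / before_mb * 100


# Sizes taken from the comments in this task, assuming 1 GB = 1000 MB.
ORIGINAL_MB = 9270       # 9.27 GB image that failed to push
NO_PIP_CACHE_MB = 9170   # about 100 MB saved by disabling Pip's cache
NEW_MATCHA_MB = 2750     # 2.75 GB after the new version of the Matcha repo

print(f"Pip cache disabled: {percent_saved(ORIGINAL_MB, NO_PIP_CACHE_MB):.1f}% smaller")
print(f"New Matcha version: {percent_saved(ORIGINAL_MB, NEW_MATCHA_MB):.1f}% smaller")
```

That is roughly 1% from disabling the cache against roughly 70% from the updated repo.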
Change #1266822 abandoned by Viktoria Hillerud WMSE:
[mediawiki/extensions/Wikispeech@master] Build new Speechoid release
Reason:
Duplicate of https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikispeech/+/1270867
Now that the iteration on and testing of STTS's latest components, updates and fixes are done, resulting in the branch "mvp_may_2026", we can build a new Speechoid release.