
Connect API to utterance store
Closed, Resolved | Public | 3 Estimated Story Points

Description

The client API should request utterance audio and synthesis metadata from UtteranceStore rather than communicating directly with Speechoid. This includes switching from accepting text input from the client (although we might keep that feature for other reasons) to accepting segment hashes and page identity as client API arguments.

Event Timeline

Change 596676 had a related patch set uploaded (by Karl Wettin (WMSE); owner: Karl Wettin (WMSE)):
[mediawiki/extensions/Wikispeech@master] Create database for utterance data

https://gerrit.wikimedia.org/r/596676

kalle added a subscriber: kalle.

I didn't mean to, but I managed to implement this, T248469 and T251261 all at the same time in the same patch.

kalle moved this task from 🥴 Backlog to 🤠 This week on the User-kalle board.
kalle added a comment. May 17 2020, 5:22 PM

To run Swift on a local machine via Docker, try this:

kalle@musa:~$ docker run -v /srv --name SWIFT_DATA busybox
Unable to find image 'busybox:latest' locally
latest: Pulling from library/busybox
d9cbbca60e5f: Pull complete
Digest: sha256:836945da1f3afe2cfff376d379852bbb82e0237cb2925d53a13f53d6e8a8c48c
Status: Downloaded newer image for busybox:latest
kalle@musa:~$ ID=$(docker run -d -p 12345:8080 --volumes-from SWIFT_DATA -t morrisjobke/docker-swift-onlyone)

It can be reached from Vagrant on 10.11.12.1:12345

		"WikispeechUtteranceFileStore": {
			"description": [
				"Connection details for utterance audio and metadata store.",
				"Set type to 'Swift', 'FileSystem' or 'RAM'."
			],
			"value": {
				"type": "Swift",
				"swiftAuthUrl": "http://10.11.12.1:12345/auth/v1.0",
				"swiftUser": "test:tester",
				"swiftKey": "testing",
				"fileSystemBasePath": "/tmp/wikispeech_utterances"
			}
		}
kalle added a comment. May 17 2020, 6:07 PM

This patch doesn't actually change the API, i.e. the part where data is retrieved from Speechoid and passed down to the user, as we (read: I) haven't upgraded the dev server.

It does, however, contain all the preparations needed to store data in and retrieve it from FileBackend, in the new class Utterances.

class Utterances {

...

	public function createUtterance( $page_id, $language, $voice, $segment_hash,
									 $audioBase64, $audioMetadata ) {

...

	public function findUtterance( $page_id, $language, $voice, $seg_hash ) {

We do not want to work with base64-encoded strings for audio data. I'm just not sure how to handle this as raw binary data in PHP. It should, however, be a quick fix to switch to that, i.e. rename the parameters and ensure the data fits in the content string passed down to FileBackend. One alternative to the content string is to write a local file and use the copy-local-file operation in FileBackend.
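A minimal sketch of the two options mentioned above, assuming the audio arrives base64-encoded. PHP strings are plain byte sequences, so the decoded audio can be carried in an ordinary string. The payload, the `mwstore://` path and the backend and container names are placeholders. The operation arrays follow the shape documented for `FileBackend::doOperations` (`create` with `content`, `store` with a local `src`), but they are only built here, not executed, so the snippet runs without MediaWiki.

```php
<?php
// Stand-in for the base64 audio string in a synthesis response.
// The bytes are made up for illustration; this is not real audio.
$rawIn = "OggS\x00\x02\xff\x00binary\x00payload";
$audioBase64 = base64_encode( $rawIn );

// Strict mode returns false on invalid input instead of silently
// skipping characters outside the base64 alphabet.
$audio = base64_decode( $audioBase64, true );
if ( $audio === false ) {
	throw new InvalidArgumentException( 'Audio is not valid base64.' );
}

// NUL and 0xFF bytes survive in a PHP string; strlen() counts bytes.
$byteCount = strlen( $audio );

// Hypothetical storage path; backend and container names are placeholders.
$dst = 'mwstore://example-backend/example-container/utterance-audio';

// Option 1: pass the raw bytes as the content string of a 'create' operation.
$createOp = [ 'op' => 'create', 'dst' => $dst, 'content' => $audio ];

// Option 2: write a local temporary file and copy it in with 'store'.
$tmp = tempnam( sys_get_temp_dir(), 'utt' );
file_put_contents( $tmp, $audio );
$storeOp = [ 'op' => 'store', 'src' => $tmp, 'dst' => $dst ];

// Read the file back to confirm the bytes were written unchanged.
$roundTrip = file_get_contents( $tmp );
unlink( $tmp );
```

Inside MediaWiki, either array would presumably be handed to the backend as `$backend->doOperations( [ $createOp ] )`. The content-string route avoids the temporary file, at the cost of holding the whole audio in memory.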

Change 596676 merged by jenkins-bot:
[mediawiki/extensions/Wikispeech@master] Create database for utterance data

https://gerrit.wikimedia.org/r/596676

Change 607285 had a related patch set uploaded (by Karl Wettin (WMSE); owner: Karl Wettin (WMSE)):
[mediawiki/extensions/Wikispeech@master] Connect API to utterance store

https://gerrit.wikimedia.org/r/607285

kalle changed the point value for this task from 16 to 3. Jul 9 2020, 8:54 AM
kalle renamed this task from Store Speechoid response as files to Connect API to utterance store. Jul 20 2020, 8:19 AM
kalle updated the task description.
kalle added a comment. Jul 20 2020, 8:21 AM

I updated the task because the previous title and description were really a comment on the code implemented in the UtteranceStore, https://gerrit.wikimedia.org/r/c/596676, which is already merged to master.

Store data from Speechoid response as files. Use [[ https://doc.wikimedia.org/mediawiki-core/master/php/classFileBackend.html | FileBackend ]] for this. This should work with audio as data in the response (T246087).

The files to store are:
# The audio
# Related data, currently tokens, as JSON
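As an illustration of the two-file layout listed above, here is a rough sketch that writes an audio file and a JSON sidecar next to each other and reads them back. It uses the local file system directly instead of FileBackend so it can run standalone. The directory, the file naming scheme, the utterance id and the token fields are all hypothetical and not taken from the patch.

```php
<?php
// Hypothetical scratch directory; the real store goes through FileBackend.
$base = sys_get_temp_dir() . '/utterance_sketch_' . uniqid();
mkdir( $base, 0700, true );

// Hypothetical identifier and made-up payloads, for illustration only.
$utteranceId = 1;
$audio = "\x00\x01\x02fake-audio-bytes";
$metadata = [ 'tokens' => [ [ 'orth' => 'Hello', 'endtime' => 0.5 ] ] ];

// File 1: the audio, written as raw bytes.
$audioPath = "$base/$utteranceId.audio";
file_put_contents( $audioPath, $audio );

// File 2: the related data (currently tokens), serialized as JSON.
$metadataPath = "$base/$utteranceId.json";
file_put_contents( $metadataPath, json_encode( $metadata ) );

// Reading both files back restores the original values.
$loadedAudio = file_get_contents( $audioPath );
$loadedMetadata = json_decode( file_get_contents( $metadataPath ), true );

// Clean up the scratch files.
unlink( $audioPath );
unlink( $metadataPath );
rmdir( $base );
```

Keeping the metadata in its own file means the token data can be read or replaced without touching the audio.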

Change 607285 merged by jenkins-bot:
[mediawiki/extensions/Wikispeech@master] Connect API to utterance store

https://gerrit.wikimedia.org/r/607285

Jopparn closed this task as Resolved. Jul 27 2020, 10:06 AM