Use MP3 instead of WAV for all engines except Larynx
Closed, Resolved | Public | 5 Estimated Story Points

Description

Per T314789#8140256, we'll want to store/serve only MP3 in Phonos rather than WAV. MP3 is widely supported and has a smaller storage footprint than uncompressed WAV.

Google already offers MP3 output. For the other two engines, we'll apparently need to use Lame (already installed on prod) to convert to MP3.
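As a rough illustration of that conversion step, here is a minimal sketch that pipes WAV bytes through an external encoder on stdin/stdout and treats only a non-zero exit status as failure. It is Python rather than the extension's PHP, it bypasses Shellbox entirely, and the names run_filter and wav_to_mp3 are hypothetical. The only detail taken from this task is the `lame - -` invocation that appears in the logs further down.

```python
import subprocess
from typing import Sequence


def run_filter(cmd: Sequence[str], data: bytes) -> bytes:
    """Feed `data` to `cmd` on stdin and return whatever it writes to stdout.

    Only the exit status decides success: encoders such as Lame print
    their banner and progress to stderr even when the conversion works.
    """
    proc = subprocess.run(cmd, input=data, capture_output=True)
    if proc.returncode != 0:
        stderr = proc.stderr.decode(errors="replace").strip()
        raise RuntimeError(
            f"{cmd[0]} exited with status {proc.returncode}: {stderr}"
        )
    return proc.stdout


def wav_to_mp3(wav_bytes: bytes, lame_path: str = "lame") -> bytes:
    """Convert WAV to MP3 with `lame - -` (stdin in, stdout out).

    `lame_path` is an assumption; point it at wherever Lame is installed.
    """
    return run_filter([lame_path, "-", "-"], wav_bytes)
```

Checking the exit status rather than the presence of stderr output matters here, because a successful Lame run still writes "Encoding <stdin> to <stdout>" and similar lines to stderr.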

Acceptance criteria

  • The user should only be served audio in MP3 format
  • (Implementation detail) Ideally only one shell command is run, to prevent unnecessary overhead of communicating with the remote Shellbox server
  • This is only for the Google and eSpeak engines. Larynx apparently has a unique issue, and that's being tracked at T319242.

Event Timeline

Change 824300 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/Phonos@master] Use MP3 instead of WAV

https://gerrit.wikimedia.org/r/824300

MusikAnimal set the point value for this task to 5.

(Implementation detail) Ideally only one shell command is run, to prevent unnecessary overhead of communicating with the remote Shellbox server

I spent a lot of time trying to do this, using a shell script so that we can bundle the eSpeak and Lame commands into one Shellbox call. I hit a wall and realized, considering we're never going to use eSpeak in production, it just wasn't worth the effort. Anyway that's what the 5 points are for! Otherwise it's just a 3.
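For anyone revisiting the single-command idea, a hedged sketch of what the bundled pipeline could look like is below. This is not what the patch does, and the helper name espeak_to_mp3_command is hypothetical. It assumes eSpeak's `--stdout` option (write WAV data to stdout instead of playing it) and the `lame - -` form seen in the logs. The eSpeak arguments Phonos actually passes are not shown in this task, so treat them as placeholders.

```python
import shlex


def espeak_to_mp3_command(text: str, espeak: str = "espeak", lame: str = "lame") -> str:
    """Build one shell pipeline: eSpeak writes WAV to stdout, Lame encodes it.

    Every argument is quoted with shlex.quote so the text cannot break out
    of the command line. The binary names are assumptions.
    """
    synth = " ".join(shlex.quote(part) for part in (espeak, "--stdout", text))
    encode = " ".join(shlex.quote(part) for part in (lame, "-", "-"))
    return f"{synth} | {encode}"


print(espeak_to_mp3_command("hello"))
```

Note that in a plain shell pipeline the exit status is that of the last command only, so a failure in eSpeak would go unnoticed unless the wrapping script enables something like bash's `set -o pipefail`.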

Just out of interest, there's quite a bit of file size difference between the MP3 rendering of "hello" and the WAV rendering, and all of our storage calculations have been based off of WAV files.. :)

MP3

  • Duration: 0.816s
  • MIME type: audio/mpeg
  • Extension: mp3
  • Size: 3.19 KB

WAV

  • Duration: 0.754083s
  • MIME type: audio/wav
  • Extension: wav
  • Size: 35.39 KB
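To put those two sizes side by side, here is a small back-of-the-envelope calculation. It uses only the figures quoted above for this one sample, so it says nothing about longer clips or other engines. The WAV comes out at roughly 11 times the size of the MP3, i.e. the MP3 is about 91% smaller.

```python
mp3_kb = 3.19   # size of the MP3 rendering of "hello", from the figures above
wav_kb = 35.39  # size of the WAV rendering of the same word

ratio = wav_kb / mp3_kb                    # how many times larger the WAV is
saving_pct = (1 - mp3_kb / wav_kb) * 100   # share of bytes saved by using MP3

print(f"WAV is {ratio:.1f}x larger; MP3 saves {saving_pct:.0f}% of the bytes")
```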

Change 824300 merged by jenkins-bot:

[mediawiki/extensions/Phonos@master] Use MP3 instead of WAV

https://gerrit.wikimedia.org/r/824300

Change 825932 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/Phonos@master] Check status code after converting to MP3

https://gerrit.wikimedia.org/r/825932

Change 825932 merged by jenkins-bot:

[mediawiki/extensions/Phonos@master] Check status code after converting to MP3

https://gerrit.wikimedia.org/r/825932

@MusikAnimal I cannot get this working with Larynx locally. The logs show:

[exec] Executing: /bin/bash '/var/www/html/w/vendor/wikimedia/shellbox/src/Command/limit.sh' ''\''/usr/bin/lame'\'' '\''-'\'' '\''-'\''' 'SB_INCLUDE_STDERR=;SB_CPU_LIMIT=180; SB_CGROUP='\'''\''; SB_MEM_LIMIT=1073741824; SB_FILE_SIZE_LIMIT=104857600; SB_WALL_CLOCK_LIMIT=180; SB_USE_LOG_PIPE=yes'
[exec] Error running /bin/bash '/var/www/html/w/vendor/wikimedia/shellbox/src/Command/limit.sh' ''\''/usr/bin/lame'\'' '\''-'\'' '\''-'\''' 'SB_INCLUDE_STDERR=;SB_CPU_LIMIT=180; SB_CGROUP='\'''\''; SB_MEM_LIMIT=1073741824; SB_FILE_SIZE_LIMIT=104857600; SB_WALL_CLOCK_LIMIT=180; SB_USE_LOG_PIPE=yes': LAME 3.100 64bits (http://lame.sf.net)
Using polyphase lowpass filter, transition band:  8269 Hz -  8535 Hz
Encoding <stdin> to <stdout>
Encoding as 22.05 kHz single-ch MPEG-2 Layer III (11x)  32 kbps qval=3

#0 /var/www/html/w/vendor/wikimedia/shellbox/src/Command/LocalBoxedExecutor.php(41): Shellbox\Command\UnboxedExecutor->execute(Shellbox\Command\BoxedCommand)
#1 /var/www/html/w/vendor/wikimedia/shellbox/src/Command/BoxedExecutor.php(20): Shellbox\Command\LocalBoxedExecutor->executeValid(Shellbox\Command\BoxedCommand)
#2 /var/www/html/w/vendor/wikimedia/shellbox/src/Command/BoxedCommand.php(183): Shellbox\Command\BoxedExecutor->execute(Shellbox\Command\BoxedCommand)
#3 /var/www/html/w/extensions/Phonos/includes/Engine/Engine.php(155): Shellbox\Command\BoxedCommand->execute()
#4 /var/www/html/w/extensions/Phonos/includes/Engine/LarynxEngine.php(73): MediaWiki\Extension\Phonos\Engine\Engine->convertWavToMp3(string)
#5 /var/www/html/w/extensions/Phonos/includes/PhonosApi.php(37): MediaWiki\Extension\Phonos\Engine\LarynxEngine->getAudioData(string, string, string)
#6 /var/www/html/w/includes/api/ApiMain.php(1900): MediaWiki\Extension\Phonos\PhonosApi->execute()
#7 /var/www/html/w/includes/api/ApiMain.php(875): ApiMain->executeAction()
#8 /var/www/html/w/includes/api/ApiMain.php(846): ApiMain->executeActionWithErrorHandling()
#9 /var/www/html/w/api.php(90): ApiMain->execute()
#10 /var/www/html/w/api.php(45): wfApiMain()
#11 {main}
[exec] Removed directory "/tmp/shellbox-bbbc24295b96bcd5"
[exec] Creating base path /tmp/shellbox-9eb94d3ba96784d5
[exec] Executing: /bin/bash '/var/www/html/w/vendor/wikimedia/shellbox/src/Command/limit.sh' ''\''/usr/bin/lame'\'' '\''-'\'' '\''-'\''' 'SB_INCLUDE_STDERR=;SB_CPU_LIMIT=180; SB_CGROUP='\'''\''; SB_MEM_LIMIT=1073741824; SB_FILE_SIZE_LIMIT=104857600; SB_WALL_CLOCK_LIMIT=180; SB_USE_LOG_PIPE=yes'
[exec] Error running /bin/bash '/var/www/html/w/vendor/wikimedia/shellbox/src/Command/limit.sh' ''\''/usr/bin/lame'\'' '\''-'\'' '\''-'\''' 'SB_INCLUDE_STDERR=;SB_CPU_LIMIT=180; SB_CGROUP='\'''\''; SB_MEM_LIMIT=1073741824; SB_FILE_SIZE_LIMIT=104857600; SB_WALL_CLOCK_LIMIT=180; SB_USE_LOG_PIPE=yes': Warning: unsupported audio format
Can't init infile '-'

#0 /var/www/html/w/vendor/wikimedia/shellbox/src/Command/LocalBoxedExecutor.php(41): Shellbox\Command\UnboxedExecutor->execute(Shellbox\Command\BoxedCommand)
#1 /var/www/html/w/vendor/wikimedia/shellbox/src/Command/BoxedExecutor.php(20): Shellbox\Command\LocalBoxedExecutor->executeValid(Shellbox\Command\BoxedCommand)
#2 /var/www/html/w/vendor/wikimedia/shellbox/src/Command/BoxedCommand.php(183): Shellbox\Command\BoxedExecutor->execute(Shellbox\Command\BoxedCommand)
#3 /var/www/html/w/extensions/Phonos/includes/Engine/Engine.php(155): Shellbox\Command\BoxedCommand->execute()
#4 /var/www/html/w/extensions/Phonos/includes/Engine/LarynxEngine.php(74): MediaWiki\Extension\Phonos\Engine\Engine->convertWavToMp3(string)
#5 /var/www/html/w/extensions/Phonos/includes/PhonosApi.php(37): MediaWiki\Extension\Phonos\Engine\LarynxEngine->getAudioData(string, string, string)
#6 /var/www/html/w/includes/api/ApiMain.php(1900): MediaWiki\Extension\Phonos\PhonosApi->execute()
#7 /var/www/html/w/includes/api/ApiMain.php(875): ApiMain->executeAction()
#8 /var/www/html/w/includes/api/ApiMain.php(846): ApiMain->executeActionWithErrorHandling()
#9 /var/www/html/w/api.php(90): ApiMain->execute()
#10 /var/www/html/w/api.php(45): wfApiMain()
#11 {main}

Change 827004 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/Phonos@master] LarnyxEngine: remove double-conversion of WAV to MP3

https://gerrit.wikimedia.org/r/827004

I cannot get this working with Larynx locally …

I was getting a bunch of different errors, but in the process I did find a glaring bug and have a patch up now to fix it. Things seem to work reliably for me now, so hopefully it fixes your issue too.

Warning: unsupported audio format
Can't init infile '-'

This might suggest you have a different and/or unsupported version of Lame, maybe? After the patch is merged (or you can try applying it yourself locally), maybe things will work. Like I said, I wasn't able to repro the errors you saw. I did forget to mention you needed to install Lame on your machine first, but it sounds like you figured that out already. I'm using Lame 64bits version 3.100 if that helps, which was the latest MediaWiki-Docker had available after apt update && apt install lame.

Change 827004 merged by jenkins-bot:

[mediawiki/extensions/Phonos@master] LarnyxEngine: remove double-conversion of WAV to MP3

https://gerrit.wikimedia.org/r/827004

I cannot get the larynx engine to produce mp3 files. Instead, it produces WAV files with an .mp3 suffix.

I have no idea what is going on. I have tried with docker on two different computers and on bare metal.

Here is the error I get in the logs, but it does not tell me anything :(.

[phonos] FileBackendStore::ingestFreshFileStats: File mwstore://phonos-backend/phonos/339e67901330e62de0457b2baa2245a0.mp3 does not exist
[objectcache] getWithSetCallback(global:rdbms-server-readonly:localhost:my_wiki:): process cache hit
[exec] Creating base path /tmp/shellbox-6ce2d56f3339c6d9
[exec] Executing: /bin/bash '/var/www/html/w/vendor/wikimedia/shellbox/src/Command/limit.sh' ''\''/usr/bin/lame'\'' '\''-'\'' '\''-'\''' 'SB_INCLUDE_STDERR=;SB_CPU_LIMIT=180; SB_CGROUP='\'''\''; SB_MEM_LIMIT=1073741824; SB_FILE_SIZE_LIMIT=104857600; SB_WALL_CLOCK_LIMIT=180; SB_USE_LOG_PIPE=yes'
[exec] Error running /bin/bash '/var/www/html/w/vendor/wikimedia/shellbox/src/Command/limit.sh' ''\''/usr/bin/lame'\'' '\''-'\'' '\''-'\''' 'SB_INCLUDE_STDERR=;SB_CPU_LIMIT=180; SB_CGROUP='\'''\''; SB_MEM_LIMIT=1073741824; SB_FILE_SIZE_LIMIT=104857600; SB_WALL_CLOCK_LIMIT=180; SB_USE_LOG_PIPE=yes': LAME 3.100 64bits (http://lame.sf.net)
Using polyphase lowpass filter, transition band:  8269 Hz -  8535 Hz
Encoding <stdin> to <stdout>
Encoding as 22.05 kHz single-ch MPEG-2 Layer III (11x)  32 kbps qval=3

#0 /var/www/html/w/vendor/wikimedia/shellbox/src/Command/LocalBoxedExecutor.php(41): Shellbox\Command\UnboxedExecutor->execute()
#1 /var/www/html/w/vendor/wikimedia/shellbox/src/Command/BoxedExecutor.php(20): Shellbox\Command\LocalBoxedExecutor->executeValid()
#2 /var/www/html/w/vendor/wikimedia/shellbox/src/Command/BoxedCommand.php(183): Shellbox\Command\BoxedExecutor->execute()
#3 /var/www/html/w/extensions/Phonos/includes/Engine/Engine.php(203): Shellbox\Command\BoxedCommand->execute()
#4 /var/www/html/w/extensions/Phonos/includes/Engine/LarynxEngine.php(73): MediaWiki\Extension\Phonos\Engine\Engine->convertWavToMp3()
#5 /var/www/html/w/extensions/Phonos/includes/Engine/Engine.php(110): MediaWiki\Extension\Phonos\Engine\LarynxEngine->getAudioData()
#6 /var/www/html/w/extensions/Phonos/includes/Phonos.php(103): MediaWiki\Extension\Phonos\Engine\Engine->getAudioUrl()
#7 /var/www/html/w/includes/parser/Parser.php(3442): MediaWiki\Extension\Phonos\Phonos->renderPhonos()
#8 /var/www/html/w/includes/parser/Parser.php(3125): Parser->callParserFunction()
#9 /var/www/html/w/includes/parser/PPFrame_Hash.php(276): Parser->braceSubstitution()
#10 /var/www/html/w/includes/parser/Parser.php(2954): PPFrame_Hash->expand()
#11 /var/www/html/w/includes/parser/Parser.php(1609): Parser->replaceVariables()
#12 /var/www/html/w/includes/parser/Parser.php(723): Parser->internalParse()
#13 /var/www/html/w/includes/content/WikitextContentHandler.php(301): Parser->parse()
#14 /var/www/html/w/includes/content/ContentHandler.php(1721): WikitextContentHandler->fillParserOutput()
#15 /var/www/html/w/includes/content/Renderer/ContentRenderer.php(47): ContentHandler->getParserOutput()
#16 /var/www/html/w/includes/Revision/RenderedRevision.php(266): MediaWiki\Content\Renderer\ContentRenderer->getParserOutput()
#17 /var/www/html/w/includes/Revision/RenderedRevision.php(237): MediaWiki\Revision\RenderedRevision->getSlotParserOutputUncached()
#18 /var/www/html/w/includes/Revision/RevisionRenderer.php(221): MediaWiki\Revision\RenderedRevision->getSlotParserOutput()
#19 /var/www/html/w/includes/Revision/RevisionRenderer.php(158): MediaWiki\Revision\RevisionRenderer->combineSlotOutput()
#20 [internal function]: MediaWiki\Revision\RevisionRenderer->MediaWiki\Revision\{closure}()
#21 /var/www/html/w/includes/Revision/RenderedRevision.php(199): call_user_func()
#22 /var/www/html/w/includes/poolcounter/PoolWorkArticleView.php(91): MediaWiki\Revision\RenderedRevision->getRevisionParserOutput()
#23 /var/www/html/w/includes/poolcounter/PoolWorkArticleViewCurrent.php(97): PoolWorkArticleView->renderRevision()
#24 /var/www/html/w/includes/poolcounter/PoolCounterWork.php(162): PoolWorkArticleViewCurrent->doWork()
#25 /var/www/html/w/includes/page/ParserOutputAccess.php(299): PoolCounterWork->execute()
#26 /var/www/html/w/includes/page/Article.php(713): MediaWiki\Page\ParserOutputAccess->getParserOutput()
#27 /var/www/html/w/includes/page/Article.php(528): Article->generateContentOutput()
#28 /var/www/html/w/includes/actions/ViewAction.php(78): Article->view()
#29 /var/www/html/w/includes/MediaWiki.php(542): ViewAction->show()
#30 /var/www/html/w/includes/MediaWiki.php(322): MediaWiki->performAction()
#31 /var/www/html/w/includes/MediaWiki.php(904): MediaWiki->performRequest()
#32 /var/www/html/w/includes/MediaWiki.php(562): MediaWiki->main()
#33 /var/www/html/w/index.php(50): MediaWiki->run()
#34 /var/www/html/w/index.php(46): wfIndexMain()
#35 {main}

Testing notes
  1. Save some {{#phonos}} template to an article (e.g. {{#phonos:ipa=ˌæpəˈlætʃə|text=Appalachia}})
  2. Right click on the IPA link and "Save link as..." (or equivalent for your browser) and save it locally
  3. Check that the type of the file is MP3

Ideally, repeat this for IPA engines google, larynx and espeak (but espeak is low priority).
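Since the file extension alone proved misleading here (a WAV saved with an .mp3 suffix), step 3 can be checked by looking at the first bytes of the downloaded file. The following is a heuristic sketch, not part of Phonos, and sniff_audio_format is a hypothetical name. It recognises a RIFF/WAVE header, a leading ID3v2 tag, or a bare MPEG Layer III frame header.

```python
def sniff_audio_format(data: bytes) -> str:
    """Guess 'wav', 'mp3' or 'unknown' from the first bytes of a file."""
    # WAV: RIFF container whose form type (bytes 8-11) is "WAVE".
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    # MP3 that starts with an ID3v2 tag.
    if data[:3] == b"ID3":
        return "mp3"
    # Bare MPEG audio frame: 11 sync bits set, and layer bits 01 (Layer III).
    if (
        len(data) >= 2
        and data[0] == 0xFF
        and (data[1] & 0xE0) == 0xE0
        and (data[1] & 0x06) == 0x02
    ):
        return "mp3"
    return "unknown"
```

Typical use would be to read the first dozen bytes of the saved file, e.g. `sniff_audio_format(open(path, "rb").read(12))`, and compare the result against the extension.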

I think I noted this somewhere already: it worked in my testing, but Larynx took significantly longer to generate a WAV file.

You're right! It's totally just giving the WAV file an .mp3 extension. I'm investigating now. I have confirmed (I think) that eSpeak is properly being converted to MP3, so it works in some cases... Anyways, back to 'In Development' we shall go!

I have created T319242: Larynx engine gives a WAV file with a MP3 extension for the outstanding issue with Larynx. Since we've confirmed all is well with the Google engine (which is what we'll use in production), I will go ahead and move this to Product Sign-off. Also re-titling the task to reflect that this doesn't include the Larynx engine.

MusikAnimal renamed this task from Use MP3 instead of WAV to Use MP3 instead of WAV for all engines except Larynx. (Oct 3 2022, 9:11 PM)
MusikAnimal updated the task description.

This one is on the more technical side so hard to do user-facing QA but thanks for your notes @Domwalden

Save some {{#phonos}} template to an article (e.g. {{#phonos:ipa=ˌæpəˈlætʃə|text=Appalachia}})
Right click on the IPA link and "Save link as..." (or equivalent for your browser) and save it locally
Check that the type of the file is MP3