PronunciationRecording currently uses UploadWizard's upload code. However, PronunciationRecording relies on workarounds (and a couple of hacks) to deal with the fact that the upload code is not fully modular (it reaches into other parts of UploadWizard).
This should be cleaned up so that the dependencies make sense. Part of this work will have to happen on the UploadWizard side (https://gerrit.wikimedia.org/r/#/c/84344/).
After that, PronunciationRecording will need to be adapted accordingly, fixing this bug.
In the future, it would be good to have the client upload code in core, but that is outside the scope of this bug.
T51988: Use Deferred/Promise and generally async-friendly code paths instead of jquery.pubsub
T64513: Create mw.Api.plugin.upload for uploading from MediaWiki frontend code