Recording long texts for modelling
Closed, Invalid (Public)

Description

Make it possible for users to create their own speech synthesis voices by recording their own voice through a web-based interface. The interface displays the text to be read (typically retrieved from Wikipedia) and gives feedback on volume, speech speed, etc. Building an understandable HMM speech synthesis voice requires approx. 20 min of speech, and more is needed for good quality. The captured speech data can also be used as the basis for a completely free database for speech recognition.
Existing: No
To do: This is not part of the MVP, and any development happens outside the scope of this project. Develop the web interface for recording using HTML5 audio, and a backend that performs initial analysis and storage of the audio files.

Identified as a component during the pilot study.
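
The "initial analysis" in the to-do is not specified further in this task. As a rough sketch only, the backend could derive the volume and speech-speed feedback mentioned in the description along these lines. The function name `analyse_recording`, the returned keys, the clipping check, and the restriction to 16-bit PCM WAV input are all assumptions made for illustration, not part of any existing Wikispeech code.

```python
import array
import math
import sys
import wave


def analyse_recording(path, prompt_text):
    """Return basic feedback metrics for a 16-bit PCM WAV recording.

    Hypothetical helper: the metrics and their names are illustrative only.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frame_rate = wav.getframerate()
        n_frames = wav.getnframes()
        samples = array.array("h", wav.readframes(n_frames))

    if n_frames == 0:
        raise ValueError("recording is empty")

    # WAV data is little-endian; swap bytes on big-endian hosts.
    if sys.byteorder == "big":
        samples.byteswap()

    duration_s = n_frames / frame_rate

    # Volume: root mean square level relative to full scale, in dBFS.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    rms_dbfs = 20 * math.log10(rms / 32768) if rms > 0 else float("-inf")

    # Clipping: any sample at or beyond the 16-bit limit.
    peak = max(abs(s) for s in samples)
    clipped = peak >= 32767

    # Speech speed: words in the prompt divided by recording length.
    # Assumes the whole prompt was read and ignores leading/trailing silence.
    words_per_minute = len(prompt_text.split()) / duration_s * 60

    return {
        "duration_s": duration_s,
        "rms_dbfs": rms_dbfs,
        "clipped": clipped,
        "words_per_minute": words_per_minute,
    }
```

The interface could compare these values against thresholds (which would have to be chosen empirically) to tell the user to speak louder, move away from the microphone, or slow down. Summing `duration_s` over accepted recordings would also show progress towards the approx. 20 min of speech mentioned above.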

Event Timeline

Sebastian_Berlin-WMSE renamed this task from "[Task] Recording long texts for modelling (Wikispeech)" to "Recording long texts for modelling". Oct 28 2019, 1:18 PM

This task was initially created for Wikispeech-Text-to-Speech, but should be reusable for Wikispeech-Speech-Data-Collector. It might need some rewording.