[16 hours] Investigate: Options of TTS engines
Closed, Resolved · Public · Spike

Description

User Story

We want to be able to understand the capabilities of different text-to-speech engines so that we can design a solution for an IPA audio renderer.

Open source:

Closed source:

Acceptance Criteria

  • Understand the input and output of the engine.
  • Determine whether we need to write something ourselves or whether the engine is plug and play.
  • Determine whether the engine supports Speech Synthesis Markup Language (SSML).

Outcome of this ticket

Create a table that lists out the following facts for each of our options:

  • How many languages does it support and which languages?
  • If it is closed source, how much does it cost?
  • Use the corpus that we have created, and record the audio output of the corpus for that library
  • How many voices does this library have? Is it only one?

Results

Status | State | Link
✅ Done | Create table with above information | Community Wishlist Survey 2022/Reading/IPA audio renderer/TTS investigation on Meta
✅ Done | Collate all TTS engines' output for the corpus that we have created | https://tnt-dev.toolforge.org/projects/tts
🕙 WIP | Choose one open source and one closed source TTS engine for further comparison | 🕙

Event Timeline

• JMcLeod_WMF renamed this task from Investigate: Open source IPA engines to [8 hours] Investigate: Open source IPA engines. May 5 2022, 5:39 PM
• JMcLeod_WMF added a project: Spike.
Restricted Application changed the subtype of this task from "Task" to "Spike". May 5 2022, 5:39 PM
HMonroy updated the task description.
• NRodriguez renamed this task from [8 hours] Investigate: Open source IPA engines to [8 hours] Investigate: Options of IPA engines. May 18 2022, 6:30 PM
• NRodriguez updated the task description.
• NRodriguez updated the task description.
TheresNoTime changed the task status from Open to In Progress. May 19 2022, 12:45 AM
TheresNoTime updated the task description.

FYI, espeak-ng can use the voices provided by the MBROLA engine. From brief testing, these sound better. However, according to the previous link, the voices are cost-free for non-commercial purposes but are not open source.

Good point. There's a similar situation for some voices that larynx supports (though even those licences seem pretty "open source", excluding the "Blizzard 2017 Materials" one).
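For anyone wanting to compare the stock espeak-ng voices against the MBROLA ones on the corpus, here is a minimal sketch of how that could be scripted. It assumes espeak-ng is installed and on the PATH, and that an MBROLA voice such as `mb-en1` has been installed separately; the helper names, the default voice, and the output path are illustrative only. The `-v`, `-w`, and `-m` flags select the voice, write a WAV file, and ask espeak-ng to interpret SSML markup, respectively.

```python
import shutil
import subprocess


def build_espeak_command(text, voice="mb-en1", wav_path="out.wav", ssml=False):
    """Return the espeak-ng argument list for synthesising `text` to a WAV file.

    The default voice name and output path are assumptions for illustration.
    """
    cmd = ["espeak-ng", "-v", voice, "-w", wav_path]
    if ssml:
        cmd.append("-m")  # -m: interpret SSML markup in the input text
    cmd.append(text)
    return cmd


def synthesise(text, **kwargs):
    """Run espeak-ng if it is installed; return False when it is not found."""
    if shutil.which("espeak-ng") is None:
        return False
    subprocess.run(build_espeak_command(text, **kwargs), check=True)
    return True
```

Calling `synthesise("hello world", voice="en", wav_path="en.wav")` and then again with `voice="mb-en1"` would give two files to listen to side by side, provided both voices are available locally.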

TheresNoTime renamed this task from [8 hours] Investigate: Options of IPA engines to [8 hours] Investigate: Options of TTS engines. May 25 2022, 1:16 PM
• JMcLeod_WMF renamed this task from [8 hours] Investigate: Options of TTS engines to [16 hours] Investigate: Options of TTS engines. Jun 1 2022, 2:40 PM

See the report on Meta for the de facto chosen TTS engine(s).