Officially support languages the TTS does not support IPA for
Open, Needs Triage · Public

Description

The Google TTS supports IPA as input for

  • Dutch, English, Filipino, French, German, Hindi, Indonesian, Italian, Korean, Polish, Russian, Spanish, Turkish

but not for

  • Afrikaans, Arabic, Basque, Bengali, Bulgarian, Cantonese, Catalan, Czech, Danish, Finnish, Galician, Greek, Gujarati, Hebrew, Hungarian, Icelandic, Japanese, Kannada, Latvian, Lithuanian, Malay, Malayalam, Mandarin, Marathi, Norwegian, Brazilian Portuguese, European Portuguese, Punjabi, Romanian, Serbian, Slovak, Swedish, Tamil, Telugu, Thai, Ukrainian, Vietnamese

Currently, specifying one of the latter in <phonos> results in ipa= being ignored. Officially support them by making ipa= optional.
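To make the split concrete, here is a minimal sketch of a lookup built from the two lists above. The names `IPA_SUPPORTED`, `TEXT_ONLY` and `ipa_is_used` are hypothetical, and the plain language names are illustrative keys only; neither Phonos nor the TTS engine is known to expose anything like this.

```python
# Language lists copied from the task description. The plain names are
# illustrative keys, not the language codes Phonos or the engine uses.
IPA_SUPPORTED = frozenset({
    "Dutch", "English", "Filipino", "French", "German", "Hindi",
    "Indonesian", "Italian", "Korean", "Polish", "Russian", "Spanish",
    "Turkish",
})

TEXT_ONLY = frozenset({
    "Afrikaans", "Arabic", "Basque", "Bengali", "Bulgarian", "Cantonese",
    "Catalan", "Czech", "Danish", "Finnish", "Galician", "Greek",
    "Gujarati", "Hebrew", "Hungarian", "Icelandic", "Japanese", "Kannada",
    "Latvian", "Lithuanian", "Malay", "Malayalam", "Mandarin", "Marathi",
    "Norwegian", "Brazilian Portuguese", "European Portuguese", "Punjabi",
    "Romanian", "Serbian", "Slovak", "Swedish", "Tamil", "Telugu", "Thai",
    "Ukrainian", "Vietnamese",
})


def ipa_is_used(language: str) -> bool:
    """Return True if the engine reads the ipa= value for this language.

    For languages in TEXT_ONLY the ipa= value is ignored and only the
    text is read, which is the behaviour this task is about.
    """
    if language in IPA_SUPPORTED:
        return True
    if language in TEXT_ONLY:
        return False
    raise ValueError(f"unknown language: {language}")
```

With such a lookup, a caller could decide whether to require ipa= at all for a given lang value instead of silently ignoring it.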

I think we should definitely support whatever the engine supports.

Event Timeline

Restricted Application added a subscriber: Aklapper.

As it stands, the ipa parameter is optional iff:

  • the file parameter is set or,
  • the wikibase parameter is set

Are you asking that the ipa parameter become optional if the text parameter is set (in addition to the above)?
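The rule described above, and the change being asked about, could be sketched roughly as follows. This is not the Phonos implementation: the function name `ipa_is_optional` and the `allow_text_only` switch are assumptions made for illustration, and only the parameter names file, wikibase and text come from this thread.

```python
from typing import Optional


def ipa_is_optional(
    file: Optional[str] = None,
    wikibase: Optional[str] = None,
    text: Optional[str] = None,
    *,
    allow_text_only: bool = False,
) -> bool:
    """Decide whether ipa= may be omitted from a <phonos> tag.

    Current rule (as described in this thread): ipa= is optional only
    when file= or wikibase= is set.
    Proposed rule (hypothetical allow_text_only switch): ipa= is also
    optional when text= is set.
    """
    if file or wikibase:
        return True
    return allow_text_only and bool(text)
```

Under the current rule `ipa_is_optional(text="blahblah")` is false, so a text-only tag needs either an ipa= value or a workaround; with `allow_text_only=True` the same call is true.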

Yes.

TTS support for text only in phonos should fail with an error

A user can make IPA optional in a sense by using an IPA character not supported by Google, as it will ignore the unsupported IPA and just read the text.

For example:

<phonos ipa="#" text="blahblah" lang="en" />

It will ignore the # and just read blahblah.

The idea is to make it possible without a hack like this.

This is something that needs to be addressed because we don't have a way to validate IPA.
Product-wise, TTS support is not in the roadmap so I'm gonna close this task.

Feel free to change the title and reopen, or create a new one requesting TTS support as a separate feature, which I think would be a valuable thing on its own, just not in the scope of this project at the moment.

@dmaza So let me get this straight. You're saying that Phonos is going to stop supporting

  • Afrikaans, Arabic, Basque, Bengali, Bulgarian, Cantonese, Catalan, Czech, Danish, Finnish, Galician, Greek, Gujarati, Hebrew, Hungarian, Icelandic, Japanese, Kannada, Latvian, Lithuanian, Malay, Malayalam, Mandarin, Marathi, Norwegian, Brazilian Portuguese, European Portuguese, Punjabi, Romanian, Serbian, Slovak, Swedish, Tamil, Telugu, Thai, Ukrainian, Vietnamese

which it currently supports, and will support only

  • Dutch, English, Filipino, French, German, Hindi, Indonesian, Italian, Korean, Polish, Russian, Spanish, Turkish

even though at T320523#8453284 @Samwilson said "I think we should definitely support whatever the engine supports", and despite the fact many languages have "shallow" orthographies so orthographic text is just as reliably indicative of pronunciation as IPA?

I mean, that's a choice, but I wonder how the Wikimedia community will receive a feature that only supports a handful of mostly western European languages when supporting 37 other languages, many from the Global South, comes at no extra cost (or at less cost, as not supporting them requires you to actively exclude them in code).

The CWS voters weren't asking for a tool that generated audio from IPA as editors. They were asking for a feature that made trying to read IPA redundant as readers. It doesn't make a lick of difference whether the audio is actually generated from IPA as input. Even if it didn't support any IPA input it would still be a satisfactory solution. Wikis have been using IPA to express pronunciation precisely because something like Phonos wasn't available. To exclude languages the TTS happens to not support IPA for just because the ostensible task was to "generate audio for IPA" is absurd.

I mean, what are you even rolling it out on Afrikaans Wiktionary and Arabic Wikipedia for if you're not going to support Afrikaans or Arabic?! The fact you chose them as pilot wikis tells me you weren't thinking this through. As long as you're using Google and don't intend to exclude dozens of widely spoken and minoritized languages, you have to make IPA optional.

Nardog renamed this task from Make IPA optional to Officially support languages the TTS does not support IPA for. · Mar 14 2023, 10:04 PM
Nardog reopened this task as Open.
Nardog updated the task description.

The CWS voters weren't asking for a tool that generated audio from IPA as editors. They were asking for a feature that made trying to read IPA redundant as readers.

I am the author of the original CWS 'wish', which was explicitly for "A tool or gadget that takes IPA as input and outputs an audio file or stream.". The quote above does not represent my views, nor any expressed in comments supporting the wish.

The Phabricator ticket which I opened and which led to that wish, T298950, is equally explicit in its intention: "We should have a tool that makes IPA transcriptions (e.g. /ˈbɜːrmɪŋəm/ for "Birmingham") hearable as audio." as is an earlier ticket, T33221: "...using the existing IPA in each page (in Wiktionary, and some in Wikipedia and elsewhere) to auto-generate a sound file...". None of these refer to wanting a text-to-speech renderer.

It doesn't make a lick of difference whether the audio is actually generated from IPA as input.

It does to me. I want to know what audio the IPA represents, not what Google (or anyone else) thinks the text represents; I have adequate browser-based tools for that.

/ˈbɜːrmɪŋəm/ does not represent any audio. It is a diaphonemic notation, and as Help:IPA/English, to which such notations all over English Wikipedia are linked, explains with great pains, it represents not any one pronunciation but those of a whole host of accents. Whatever your "browser-based tools" are, they're not teaching you "what audio the IPA represents" because that is a categorical impossibility. The chances are what you're hearing is precisely what whatever engine the tools use thinks the text represents, just like Phonos.

(EDIT: It seems you might have meant that you don't need to know what an engine thinks the text represents because you already have tools for that, and you want to know what audio the IPA represents. In either case, my point stands: such a tool is impossible because the vast majority of IPA transcriptions represent abstractions and not precise qualities.)

And it is clear from the project page on Meta that "asking for a feature that made trying to read IPA redundant as readers" is representative of how CommTech interpreted your proposal. That may not have been your intent as the proposer, but the ship has sailed a long time ago, and they're not letting their hard work thus far go to waste (nor should they).

such a tool is impossible

And yet a 'wish' for such a tool was accepted. As I have recently told you elsewhere: "If an IPA-to-audio renderer is not possible, the request should have been - and indeed still should be - declined."

...how CommTech interpreted your proposal

If so, then CommTech did so wrongly. I am, though, at a loss to know how they - or anyone - could read the ticket quoted above and do so.

Then consider your proposal declined, and WMF is off to developing something, albeit inspired by the proposal, entirely different.