User Details
- User Since
- Apr 28 2016, 9:04 AM (502 w, 23 h)
- MediaWiki User
- HaraldBerthelsen
Mar 11 2021
I set up a proxy and changed Services -> lexicon in the wikispeech_server config file to localhost/lexserver, and it works. The links work as well, with the lexserver prefix.
Not sure if it will work with the setup on wikilabs - but with a bit of luck it will.
Need to check this.
In my setup, using the -r lexserver switch breaks the links in http://localhost:8787/lexicon/, but wikispeech_server works normally. Of course, there is no proxy in use there. I'll set one up and see if I get the same error as in the log above.
Feb 15 2021
There are now test examples in templates/test.html of SSML with IPA and of the new "ipa" input type.
To check which IPA symbols are allowed for a certain language, see http://localhost:8771/symbolset/content/sv-se_ws-sampa (the URL of the symbolset mapper).
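As a rough illustration of what SSML input with IPA can look like, here is a minimal sketch that wraps one word in a standard SSML phoneme element and builds a request URL. The helper names, the parameter name input_type=ssml, and the sample transcription are assumptions for illustration; the actual examples are the ones in templates/test.html, and the allowed symbols are whatever the symbolset mapper lists.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape, quoteattr


def ssml_with_ipa(word, ipa, lang="sv"):
    """Wrap one word in an SSML phoneme element carrying an IPA transcription."""
    return (
        f"<speak xml:lang={quoteattr(lang)}>"
        f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>{escape(word)}</phoneme>'
        f"</speak>"
    )


def request_url(base, lang, ssml):
    """Build a query URL; 'input_type=ssml' is an assumed parameter name."""
    query = urlencode({"lang": lang, "input_type": "ssml", "input": ssml})
    return f"{base}?{query}"


# Illustrative transcription only; check the symbol set for the language first.
ssml = ssml_with_ipa("sju", "ɧʉː")
url = request_url("http://localhost:10000/", "sv", ssml)
```

urlencode takes care of percent-encoding the non-ASCII symbols, so the SSML string can be written in plain Unicode.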
Feb 10 2021
Test version available in git; it allows IPA in SSML input.
Jan 27 2021
Oct 15 2020
Oct 1 2020
Sep 17 2020
Need to update to the latest version of mishkal.
Sep 8 2020
Changed to use
logging.handlers.SysLogHandler(address = '/dev/log')
instead of
logging.StreamHandler()
Added a setting to wikispeech-server/default.conf:
#logger defaults to stderr, uncomment here if you want something else
#logger: syslog
#logger: /tmp/wikispeech.log
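For reference, here is a minimal sketch of how a setting like this could be mapped to a handler. The function name make_handler and the logger name are made up for illustration, and this is not necessarily how wikispeech-server does it; only the three options (stderr, syslog, file path) come from the config above.

```python
import logging
import logging.handlers
import sys


def make_handler(setting=None):
    """Return a log handler for a 'logger' setting like the one in default.conf."""
    if not setting or setting == "stderr":
        # Default: log to stderr.
        return logging.StreamHandler(sys.stderr)
    if setting == "syslog":
        # /dev/log is the local syslog socket on most Linux systems.
        return logging.handlers.SysLogHandler(address="/dev/log")
    # Anything else is treated as a file path; delay=True opens the file on first write.
    return logging.FileHandler(setting, delay=True)


log = logging.getLogger("wikispeech_example")  # hypothetical logger name
log.addHandler(make_handler())
log.warning("logging to stderr")
```

Passing "syslog" only works where /dev/log exists, so on other systems the address would need to be changed.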
There is now an endpoint /default_voices that returns JSON similar to this:
Output JSON now contains timing in milliseconds instead of seconds.
audio_url is no longer returned.
Tmp dir is cleared after each call.
Sep 3 2020
Sep 2 2020
Jul 20 2020
Yes, the wrong format was returned. Now fixed in git.
Jul 10 2020
A JSON item "voice", with the items adapter, config_file, engine, language and name, has been added.
Apr 2 2020
Partly done: the response to a call to the wikispeech server, e.g. http://localhost:10000/?lang=sv&input=sju%20sj%C3%B6sjuka%20sj%C3%B6m%C3%A4n, now also contains audio_data, a base64-encoded string. The audio_url is still there as well; I will remove it later. For testing you can also try adding the parameter output_type=html, e.g. http://localhost:10000/?lang=sv&output_type=html&input=sju%20sj%C3%B6sjuka%20sj%C3%B6m%C3%A4n
This will return very simple HTML with an audio element containing the base64 string as source.
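On the client side, using audio_data could look roughly like the sketch below. It takes the JSON text of a response, checks that audio_data really is base64, and builds an audio element with a data URI. The function name and the MIME type audio/ogg are assumptions; the actual audio format depends on the server setup.

```python
import base64
import html
import json


def audio_html(response_text, mime="audio/ogg"):
    """Build a minimal HTML audio element from a JSON response with audio_data."""
    data = json.loads(response_text)["audio_data"]
    # Raises binascii.Error if the string is not valid base64.
    base64.b64decode(data, validate=True)
    return f'<audio controls src="data:{mime};base64,{html.escape(data)}"></audio>'


# Stand-in response for illustration; a real one comes from the server.
fake_response = json.dumps({"audio_data": base64.b64encode(b"abc").decode("ascii")})
element = audio_html(fake_response)
```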
Mar 19 2020
I can't reproduce this - here's what I get when starting MaryTTS without a running mishkal server:
Feb 6 2020
Nov 28 2019
Nov 26 2019
The reason for the "spike" is still unknown. But the low volume afterwards is caused by the "Amplitude Normaliser" in MaryTTS.
Nov 25 2019
No longer relevant. Close.
But if we're still using marytts we still need to consider the speed issue later.
Already fixed. Close.
Already fixed. Close.
Seems fine - open again if something like this happens again!
This is part of the (complicated) question of how to deal with ambiguity in pronunciation and part-of-speech tags. If there is a "real" pos-tagger at work, or if pos-tags are set by the user, these should be used to select pronunciation. Otherwise the lexicon might just as well set it. Changed (test) to set part-of-speech in the same way as pronunciation has been set before: Use the first entry in the lexicon, unless another one is explicitly set as preferred.
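The selection rule described here (first entry unless another is marked as preferred) can be sketched as follows. The dictionary keys "transcription", "pos" and "preferred" and the placeholder values are hypothetical and only stand in for whatever the lexicon actually returns.

```python
def select_entry(entries):
    """Pick the entry marked as preferred, falling back to the first one."""
    if not entries:
        return None
    for entry in entries:
        if entry.get("preferred"):
            return entry
    return entries[0]


# Placeholder entries for one ambiguous word (values are illustrative only).
entries = [
    {"transcription": "t1", "pos": "NN"},
    {"transcription": "t2", "pos": "VB", "preferred": True},
]
chosen = select_entry(entries)
```

With this rule both the pronunciation and the part-of-speech tag come from the same entry, so they cannot contradict each other.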
Nov 14 2019
Dec 13 2017
Dec 6 2017
Words with > (glottal stop) in transcription can now be played.
Dec 5 2017
I think Hanna has to upload a new version to Docker hub.
But it should also work to run the wikispeech server directly from the source cloned from github (I'm doing that).
Tokenisation problem in Arabic marytts. Is it even wrong? The input is very unlikely for an Arabic synthesis. However there could definitely be a better error message!
Maybe the problem is in handling the various types of quotes used in Arabic? Or just that quotes work with Arabic script input but not with Latin script.
This works:
وكتابه "نهاية الإيجار في دراية الإعجاز" يعتبر من المراجع البلاغية المهمة.
But the example in this ticket doesn't and neither does this:
"Bob Dylan"
Yes, English marytts does this. English Flite says "hash" followed by the number. Swedish marytts actually just skips the whole "#N" expression.
So for now what you can do is change the pronunciation using SSML.
But it also re-raises a question we discussed before: should there be a preprocessing step, e.g. with regular expressions, to modify the input? Maybe a good idea.
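Just to make the idea concrete, such a preprocessing step could be a list of regular-expression rules applied before the text goes to the synthesiser. The rule below, which rewrites "#" followed by digits as "number" plus the digits, is only an illustrative assumption and would have to be defined per language.

```python
import re

# Each rule is (compiled pattern, replacement); the rule here is illustrative only.
RULES = [
    (re.compile(r"#(\d+)"), r"number \1"),
]


def preprocess(text, rules=RULES):
    """Apply each regular-expression rule to the input text in order."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


result = preprocess("See issue #12.")
```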
Dec 4 2017
Corrected in git source.
Needs more testing with various kinds of input.
The input text above now gives:
Changed in git source.
Nov 13 2017
Changed to WARNING.
Nov 9 2017
No, not too big; the version on GitHub now has a fix for it.
Nov 7 2017
I'm using the current version from GitHub. But I'm not too sure when these parts were last changed. I think it was long ago, so maybe the problem isn't there at all. Could it be an encoding issue?
Anyway, the right thing to do isn't to replace tokens back to their original form as an afterthought, as is done now, but either to send the tokens from the client or to make sure that the original input is kept and returned from the server.
Nov 6 2017
Hm.. The example works for me (see json output below).
But that doesn't mean that it's actually solved in a sensible way. It's just a guess at what the client wants. Bound to go wrong in complicated cases like this, with text going in both directions and unusual punctuation characters.
The specific problem in your case is in the word ومرن ، which has an upside-down comma after it. You can see the comma as a separate token in the output token list but not in the input.
So even if you can update so that we use the same version, and solve this problem, we should still think of a better way of solving the issue.
What call are you using for this?
It seems to work for me, so maybe we are doing it differently somehow. Or else my tests just happen to work. Can you give an example of a call that fails?