User Details
- User Since
- Nov 24 2016, 4:00 PM (286 w, 2 d)
- Availability
- Available
- LDAP User
- Unknown
- MediaWiki User
- Cyrta
Sep 6 2018
What is the contract for me regarding Monumental?
Dec 16 2016
There is no task related to "promptest" creation.
Where is the list of utterances to be recorded?
There is already very good software available for both TTS and ASR recording.
Yes, but the "page" is also going to indicate the language but not the voice,
- language
- voice
- utterance
That could be sufficient. And maybe:
- page
- text coordinates on page
A simple store such as Redis or Memcached could be used to hold the indexes and the paths to the files.
Using a simple key-value store, we can cache responses on the server side too.
The TTS result would be stored in a file and indexed by the page, the text coordinates, and the utterance itself.
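A minimal sketch of that lookup, assuming the key is built from the fields listed above (language, voice, utterance, and optionally page and text coordinates). The names `cache_key`, `AudioIndex`, `get_audio_path`, and the `synthesize` callback are hypothetical, and a plain dict stands in for Redis or Memcached so the example runs without a server.

```python
import hashlib
import json


def cache_key(language, voice, utterance, page=None, coords=None):
    """Build a deterministic key from the fields that identify one TTS result."""
    # Serialise the fields in a fixed order, then hash them so the key has a
    # fixed length no matter how long the utterance is.
    fields = [language, voice, utterance, page, coords]
    payload = json.dumps(fields, ensure_ascii=False)
    return "tts:" + hashlib.sha1(payload.encode("utf-8")).hexdigest()


class AudioIndex:
    """Maps cache keys to audio file paths.

    A dict is used here as a stand-in for the key-value store (assumption).
    """

    def __init__(self, store=None):
        self.store = {} if store is None else store

    def lookup(self, key):
        return self.store.get(key)

    def remember(self, key, path):
        self.store[key] = path


def get_audio_path(index, synthesize, language, voice, utterance,
                   page=None, coords=None):
    """Return the cached file path, calling synthesize() only on a miss."""
    key = cache_key(language, voice, utterance, page, coords)
    path = index.lookup(key)
    if path is None:
        # synthesize is a hypothetical callback that writes the audio file
        # and returns its path.
        path = synthesize(language, voice, utterance)
        index.remember(key, path)
    return path
```

With a real store, `lookup` and `remember` would wrap the store's own get and set calls; the key scheme itself would not need to change.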
Dec 9 2016
Nov 24 2016
I think we can write to Prof. Tanja Schultz and ask if she can give us access to this RTAL tool.
Every paper states that it is a free tool.
In "Web-based tools and methods for rapid pronunciation dictionary creation":
As shown in Fig. 2 Wiktionary pages may contain more than one pronunciation per word. These additional pronunciations reflect alternate pronunciations, dialects or even different languages. To gain some insights into this “language-mix” we performed a brief analysis on the English, French, German, and Spanish Wiktionary editions. For German Wiktionary, for example, we found that only 67% of the detected pronunciations are for German words, the remainder is for the languages Polish (10%), French (9%), English (3%), Czech (2%), Italian (2%), etc. Fig. 3 shows this “language-mix” in the English and the French editions.
There were tools for crawling Wikipedia or getting data from expedia,
made by the team of Tanja Schultz from the Institute for Anthropomatics, Cognitive Systems Lab (CSL), Karlsruhe Institute of Technology (KIT).