HTML5 Audio Read-Along wraps each word in a <span> tag so that the word can be highlighted. Each <span> carries a few attributes, like so:
<span class="" data-index="1" tabindex="0" data-dur="0.28" data-begin="0.929">those</span>
These tags need to be generated automatically from the synthesis API's response.
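As a rough illustration of what such generation could look like, here is a minimal Python sketch. It assumes the synthesis response has already been parsed into a list of tokens, each with a word, a begin time, and a duration in seconds. The keys `word`, `begin`, and `dur` and the function name `tokens_to_spans` are hypothetical and not taken from the actual API; numbering `data-index` from 1 is also an assumption based on the example above.

```python
from html import escape


def tokens_to_spans(tokens):
    """Build one <span> per token, mirroring the attributes shown above.

    `tokens` is a list of dicts with the hypothetical keys
    "word", "begin" and "dur" (times in seconds).
    """
    spans = []
    # Assumption: data-index counts tokens starting from 1.
    for index, token in enumerate(tokens, start=1):
        spans.append(
            '<span class="" data-index="{}" tabindex="0" '
            'data-dur="{}" data-begin="{}">{}</span>'.format(
                index,
                token["dur"],
                token["begin"],
                escape(token["word"]),  # escape &, < and > in the word
            )
        )
    # Separate the spans with single spaces, as in running text.
    return " ".join(spans)


example = tokens_to_spans([{"word": "those", "begin": 0.929, "dur": 0.28}])
print(example)
```

For the single token above, this reproduces the example <span> from the description.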
Description
Status | Assigned | Task
---|---|---
Open | None | T152430 Run Wikispeech offline
Resolved | HaraldBerthelsen | T143644 Multiple requests to TTS server should not cause delay
Resolved | Jopparn | T151786 Publically accessible demo (player) [Stage 1+2]
Resolved | Sebastian_Berlin-WMSE | T122158 Highlight recited text (was: Display the read word)
Invalid | None | T134750 [Task] Generate tags with time information (Wikispeech)
Resolved | Sebastian_Berlin-WMSE | T140105 Map TTS response to page HTML
Event Timeline
The time information is stored under the utterances element, rather than in conjunction with the actual HTML substrings. It may not be necessary to add spans for each token, see T122158#2651737.