Summary from DevSummit on Annotations
- Decided to use OpenAnnotation (JSON spec) for storing Wikispeech annotations, to ensure future compatibility.
- Storage will probably end up relying on Multi-content revision, but we are likely a year away from being able to build an Annotation Service on top of this.
- FileAnnotations will start work on an annotation service in the coming months (possibly as a separate extension).
Work-around until an annotation service is live
Without a proper annotation service we will probably be limited to storing tags in the wikitext itself. This blocks releasing it as a beta feature, since it affects content visible to all readers. However, it lets us develop the surrounding components (editing interface etc.) and will illustrate, in the demo, how editing would work. A semi-open demo would also improve the underlying language data, which can be reused later.
It then becomes important to clearly separate the different editing components so that as much as possible can be reused once we switch to an annotation service.
- The component for combining the annotation data with the page content.
- Storing an annotation (in the work-around, stored as a tag surrounding the selected word(s)).
- Accessing annotations from outside of the page, e.g. a mechanism for suggesting lexical changes (additions) based on frequently annotated word(s) with non-standard pronunciations, and for changing these annotations to a pronunciation variant once it is added to the lexicon.
- Trigger for page categorisation based on annotations (if desired)
- Logic for deciding when to drop an annotation.
- Defining the tag in the software itself (so that Wikispeech knows to handle <wikispeech> in a certain way). Possibly part of how annotation data is combined with page content. See Tag extensions.
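The first component above (combining annotation data with page content) can be sketched as a function that strips the interim <wikispeech> tags out of the wikitext and returns clean text plus annotations anchored to character offsets. This is only an illustration under the suggested tag syntax; the function name and returned structure are assumptions, not an agreed-upon API, and a real annotation service would replace this extractor entirely.

```python
# Sketch of the "combine annotation data with page content" component,
# assuming the interim <wikispeech data="..."> tag syntax.
import html
import json
import re

TAG_RE = re.compile(r'<wikispeech data="([^"]*)">(.*?)</wikispeech>', re.DOTALL)

def extract_annotations(wikitext):
    """Strip <wikispeech> tags, returning (clean_text, annotations),
    with each annotation anchored to offsets in the clean text."""
    annotations = []
    clean_parts = []
    pos = 0      # position in the original wikitext
    offset = 0   # position in the clean output text
    for match in TAG_RE.finditer(wikitext):
        clean_parts.append(wikitext[pos:match.start()])
        offset += match.start() - pos
        inner = match.group(2)
        annotations.append({
            "start": offset,
            "end": offset + len(inner),
            # The data attribute carries HTML-escaped JSON.
            "data": json.loads(html.unescape(match.group(1))),
        })
        clean_parts.append(inner)
        offset += len(inner)
        pos = match.end()
    clean_parts.append(wikitext[pos:])
    return "".join(clean_parts), annotations
```

Keeping this step isolated means the generated utterances sent to the TTS look the same regardless of where the annotations come from.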
Ensure the following are not affected:
- The structure/encoding of the annotation itself (should use OpenAnnotation)
- Cleaning (we just need to ensure <wikispeech> is not completely removed)
- The generated utterances (which are sent to the TTS service) should look the same regardless of the annotation mechanism, i.e. all combining of annotations and text needs to be handled at an earlier stage.
- The editing interface (this should just produce the annotation data and link it to a range in the text).
- Lexicon changes (i.e. that a word should always be pronounced differently)
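To make the first point above concrete, an annotation encoded per the OpenAnnotation model might look like the sketch below. The field names follow the W3C Web Annotation vocabulary (the successor to OpenAnnotation); the body value, purpose, and target URL are made-up illustrations, not a settled Wikispeech schema.

```python
# Illustrative annotation following the W3C (Open/Web) Annotation model.
# All concrete values are hypothetical examples.
import json

annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "type": "Annotation",
    "body": {
        "type": "TextualBody",
        "value": "ˈknatːar",        # e.g. a corrected IPA transcription
        "purpose": "describing",
    },
    "target": {
        "source": "https://sv.wikipedia.org/wiki/Exempel",  # hypothetical page
        "selector": {
            "type": "TextPositionSelector",
            "start": 8,
            "end": 15,
        },
    },
}

print(json.dumps(annotation, ensure_ascii=False, indent=2))
```

Because this structure is independent of where it is stored, the same JSON can sit inside a wikitext tag today and in an annotation service later.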
Suggested tag syntax:
Hej små <wikispeech data="<annotation_data>">knattar</wikispeech>
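One practical detail with this syntax: since the annotation data is JSON embedded in an attribute, it needs HTML-entity escaping so the quotes do not break the attribute. A minimal sketch (function name and payload are illustrative):

```python
# Serialise annotation data into the interim <wikispeech> tag.
# html.escape keeps the embedded JSON from breaking the attribute quoting.
import html
import json

def wrap_annotation(word, data):
    """Wrap a selected word in a <wikispeech> tag carrying its annotation."""
    payload = html.escape(json.dumps(data, ensure_ascii=False), quote=True)
    return '<wikispeech data="%s">%s</wikispeech>' % (payload, word)

print("Hej små " + wrap_annotation("knattar", {"ipa": "ˈknatːar"}))
```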
Player demo (T151786) should be ready towards the end of January. Once that is live we will start looking at editing.
- Overview: which components do we need and how will they interact? Clearly mark any components that will be affected by the change in the annotation service. This stage is extra important to spend time on, since we can save some time on the components we know are temporary, but want as few of these as possible. A first overview identifies the following stages: create, save, realise, (communicate), change [an annotation] and change lexicon.
- Identify which SSML tags (e.g. for IPA transcriptions) and TTS instructions (e.g. "use pronunciation variant #2") need to be supported by the TTS (T148635). This doesn't block any development (except the definition of the data structure), but would be good to have done so that Wikispeech-STTS can work on the TTS.
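As an example of the kind of SSML meant here, an IPA transcription would typically be sent to the TTS via the standard <phoneme> element (W3C SSML 1.1). The helper and the transcription below are illustrative only:

```python
# Render a word with an explicit IPA pronunciation as standard SSML.
from xml.sax.saxutils import escape

def ipa_phoneme(word, ipa):
    """Wrap a word in an SSML <phoneme> element with an IPA transcription."""
    return '<phoneme alphabet="ipa" ph="%s">%s</phoneme>' % (
        escape(ipa, {'"': "&quot;"}), escape(word))

print('<speak>Hej små %s</speak>' % ipa_phoneme("knattar", "ˈknatːar"))
```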