Use XPath to get text nodes related to utterances
Closed, Resolved | Public | 1.5 Estimated Story Points

Description

The current method of getting the text nodes related to an utterance uses a path made up of indices, e.g. [1, 0, 3]. It looks like it should be possible to replace this with XPath expressions, with all the benefits of using a standard implementation.

So let's do that.
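
To illustrate the idea, here is a minimal sketch of how an index path could map to an XPath expression. It assumes each index is a zero-based position among a node's children, which the description does not state explicitly, and the function name `index_path_to_xpath` is hypothetical, not something from the Wikispeech code.

```python
from typing import Sequence


def index_path_to_xpath(path: Sequence[int]) -> str:
    """Convert a zero-based child-index path to a relative XPath expression.

    Assumption: each index selects the n-th child node (of any type)
    of the node selected by the previous step.
    """
    if any(i < 0 for i in path):
        raise ValueError("indices must be non-negative")
    # XPath positions are 1-based, so each index is shifted by one.
    steps = [f"node()[{i + 1}]" for i in path]
    # An empty path refers to the context node itself.
    return "./" + "/".join(steps) if steps else "."


print(index_path_to_xpath([1, 0, 3]))
```

With the example path from the description, this prints `./node()[2]/node()[1]/node()[4]`. The resulting string could then be evaluated by any standard XPath implementation, relative to whatever node the index path started from.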

Details

	Subject	Repo	Branch	Lines +/-
	Use XPath to get text nodes for utterances.	mediawiki/extensions/Wikispeech	master	+223 -731


Related Objects

		Status	Subtype	Assigned	Task
		Resolved		Sebastian_Berlin-WMSE	T158954 Use XPath to get text nodes related to utterances
		Resolved		Sebastian_Berlin-WMSE	T148622 Highlight recited sentence

Event Timeline

Sebastian_Berlin-WMSE created this task. Feb 24 2017, 11:07 AM

Sebastian_Berlin-WMSE added a subtask: T148622: Highlight recited sentence.

Sebastian_Berlin-WMSE moved this task from Backlog to This Week on the User-Sebastian_Berlin-WMSE board. Mar 2 2017, 10:27 AM

Sebastian_Berlin-WMSE claimed this task. Mar 2 2017, 10:32 AM

While not necessary for T148623: Highlight recited word, I think it's better to take a look at this first.

I have an implementation of this that works, but still needs a bit of clean up and updating of tests. This will wait until T148622: Highlight recited sentence is done, to minimize extra work with merging.

A highlight of using XPath is that all the code in Cleaner that deals with extracting tags from the HTML is no longer needed.

Sebastian_Berlin-WMSE updated the task description. Mar 3 2017, 11:06 AM

Sebastian_Berlin-WMSE edited projects, added Wikispeech (Sprint 2017-02-22); removed Wikispeech.

Sebastian_Berlin-WMSE moved this task from Backlog to In progress on the Wikispeech (Sprint 2017-02-22) board.

Jopparn closed subtask T148622: Highlight recited sentence as Resolved. Mar 6 2017, 10:14 AM

Jopparn moved this task from This Week to Backlog on the User-Sebastian_Berlin-WMSE board. Mar 6 2017, 10:31 AM

Lokal_Profil added a project: Wikispeech-WMSE. Mar 8 2017, 9:18 AM

Note the change of scope for the task.

Worked on in Wikispeech (Sprint 2017-02-22):

Preliminary investigation and basic implementation

To do in Wikispeech (Sprint 2017-03-08):

Update tests
Proper implementation

Lokal_Profil moved this task from Backlog to In progress on the Wikispeech (Sprint 2017-03-08) board. Mar 8 2017, 9:25 AM

Sebastian_Berlin-WMSE moved this task from Backlog to This Week on the User-Sebastian_Berlin-WMSE board. Mar 8 2017, 10:56 AM

While not directly related to this task, I discovered that CleanedTags are not needed currently. They were required for calculating character positions in the original HTML. Some representation of tags is likely needed for T133689: Recognise certain tags, notify user and allow interaction (was: Pling on navigation), but only for the elements that should give some kind of feedback.

Change 342024 had a related patch set uploaded (by Sebastian Berlin (WMSE)):
[mediawiki/extensions/Wikispeech] Use XPath to get text nodes for utterances.

https://gerrit.wikimedia.org/r/342024

gerritbot added a project: Patch-For-Review. Mar 9 2017, 3:17 PM

Sebastian_Berlin-WMSE mentioned this in rEWISbc194a62a763: Use XPath to get text nodes for utterances. Mar 9 2017, 3:21 PM

Sebastian_Berlin-WMSE mentioned this in rEWIS53074f6532bc: Use XPath to get text nodes for utterances. Mar 10 2017, 3:15 PM

Lokal_Profil added a project: User-LokalProfil. Mar 13 2017, 10:17 AM

Lokal_Profil moved this task from 📥 Backlog to 🔬 to Review on the User-LokalProfil board.

Sebastian_Berlin-WMSE mentioned this in T159545: Unicode characters increase length of highlighting. Mar 15 2017, 12:39 PM

Sebastian_Berlin-WMSE mentioned this in rEWIS69dd11f45fc4: Use XPath to get text nodes for utterances. Mar 15 2017, 3:43 PM

Worked on in Wikispeech (Sprint 2017-03-08):