
Links and "'s" causes token generation to fail
Closed, Invalid | Public | 4 Estimated Story Points

Description

A link (and probably other elements too) followed by "'s" causes token generation to fail. For example, the utterance:
<p><a href="/w/index.php?title=Bob_Johnston&amp;action=edit&amp;redlink=1" class="new" title="Bob Johnston (page does not exist)">Bob Johnston</a>'s</p>
only creates a token for "Bob".

Event Timeline

Lokal_Profil set the point value for this task to 4. Feb 21 2017, 2:28 PM
Lokal_Profil added a subscriber: Lokal_Profil.

See test page for example.

The more general problem was handling a token that is broken up by a cleaned tag. E.g. from the original HTML:
tok<hr />en
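
To make the split-token case concrete, here is a minimal Python sketch of one way to map a token back to the text nodes it spans. It is not the Wikispeech implementation: the names `TextNodeCollector` and `tokens_with_node_spans` are hypothetical, tokens are assumed to be whitespace-separated, and "cleaning" is assumed to mean simply dropping every tag while keeping the text.

```python
from html.parser import HTMLParser
import re


class TextNodeCollector(HTMLParser):
    """Collects the text nodes of an HTML fragment, ignoring all tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.nodes = []

    def handle_data(self, data):
        self.nodes.append(data)


def tokens_with_node_spans(html):
    """Tokenize the cleaned text and map each token to its text nodes.

    Returns a list of (token, spans) tuples, where spans is a list of
    (node_index, start, end) slices into the individual text nodes.
    Assumption: tokens are runs of non-whitespace characters.
    """
    collector = TextNodeCollector()
    collector.feed(html)
    collector.close()
    nodes = collector.nodes

    # Offset of each text node within the concatenated (cleaned) text.
    offsets = []
    position = 0
    for node in nodes:
        offsets.append(position)
        position += len(node)
    cleaned = "".join(nodes)

    result = []
    for match in re.finditer(r"\S+", cleaned):
        spans = []
        for index, node in enumerate(nodes):
            node_start = offsets[index]
            node_end = node_start + len(node)
            # Overlap between the token and this text node, if any.
            start = max(match.start(), node_start)
            end = min(match.end(), node_end)
            if start < end:
                spans.append((index, start - node_start, end - node_start))
        result.append((match.group(), spans))
    return result
```

With this sketch, `tokens_with_node_spans('tok<hr />en')` yields a single token "token" covering characters 0 to 3 of the first text node and 0 to 2 of the second, and the link example from the description yields "Bob" and "Johnston's", the latter spanning both the link text and the trailing "'s".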

I have a solution that should fix this, but it will need some work after T148622: Highlight recited sentence, since it still uses positions and not paths.

I can't reproduce this with the current revision (aca4abacc4a41145d3f30ba5434416aed4018896) on the test wiki. The reason is that the tokens are just added without looking them up in the original text nodes.