
The experimental highlighter should not break unicode surrogate pairs when cutting the snippet.
Closed, Resolved · Public

Description

As reported in https://github.com/wikimedia/search-highlighter/issues/32, the highlighter may split a UTF-16 surrogate pair when it chooses the start or end offset of a search snippet, leaving an unpaired surrogate at the edge of the snippet.
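
To illustrate the failure mode, here is a minimal Java sketch of one way to keep a cut from landing inside a surrogate pair. It is not the code from the Gerrit patches; the class `SnippetBounds` and its methods `safeStart`, `safeEnd` and `cut` are hypothetical names, and the choice to widen the snippet (rather than shrink it) is an assumption made for the example. Only `Character.isHighSurrogate` and `Character.isLowSurrogate` from the JDK are relied on.

```java
// Hypothetical helper, for illustration only; not the highlighter's actual code.
final class SnippetBounds {
    private SnippetBounds() {}

    // If 'start' points at a low surrogate whose high surrogate sits just
    // before it, step back one char so the snippet begins on the whole pair.
    public static int safeStart(CharSequence source, int start) {
        if (start > 0 && start < source.length()
                && Character.isLowSurrogate(source.charAt(start))
                && Character.isHighSurrogate(source.charAt(start - 1))) {
            return start - 1;
        }
        return start;
    }

    // 'end' is exclusive. If the last included char is a high surrogate and
    // the next char is its low surrogate, step forward to include the pair.
    public static int safeEnd(CharSequence source, int end) {
        if (end > 0 && end < source.length()
                && Character.isHighSurrogate(source.charAt(end - 1))
                && Character.isLowSurrogate(source.charAt(end))) {
            return end + 1;
        }
        return end;
    }

    // Cuts source[start, end) after adjusting both offsets.
    public static String cut(String source, int start, int end) {
        return source.substring(safeStart(source, start), safeEnd(source, end));
    }
}
```

For example, with `"ab\uD83D\uDE00cd"` (an emoji stored as two chars at offsets 2 and 3), a naive `substring(0, 3)` ends on a lone high surrogate, while `SnippetBounds.cut(s, 0, 3)` widens the end offset to 4 so the pair stays intact.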

Details

Related Gerrit Patches:

Event Timeline

dcausse created this task. Nov 5 2018, 4:12 PM
Restricted Application added a subscriber: Aklapper. Nov 5 2018, 4:12 PM
Restricted Application added a project: Discovery-Search. Nov 5 2018, 4:12 PM
EBjune renamed this task from "The experimental highlighter should bot break unicode surrogate pairs when cutting the snippet." to "The experimental highlighter should not break unicode surrogate pairs when cutting the snippet." Nov 6 2018, 4:42 PM
dcausse claimed this task. Nov 6 2018, 4:59 PM
dcausse triaged this task as Normal priority.
dcausse moved this task from needs triage to Current work on the Discovery-Search board.

Change 472510 had a related patch set uploaded (by DCausse; owner: DCausse):
[search/highlighter@master] Do not break surrogate pairs when extracting snippets

https://gerrit.wikimedia.org/r/472510

Change 472510 merged by Gehel:
[search/highlighter@master] Do not break surrogate pairs when extracting snippets

https://gerrit.wikimedia.org/r/472510

Change 473026 had a related patch set uploaded (by DCausse; owner: DCausse):
[search/highlighter@5.5] Do not break surrogate pairs when extracting snippets

https://gerrit.wikimedia.org/r/473026

Change 473026 merged by jenkins-bot:
[search/highlighter@5.5] Do not break surrogate pairs when extracting snippets

https://gerrit.wikimedia.org/r/473026

debt closed this task as Resolved. Nov 29 2018, 7:55 PM