
Parsoid breaks cross-page hyphen joining in ProofreadPage
Closed, Duplicate | Public | BUG REPORT

Description

Steps to replicate the issue:
Go to https://en.wikisource.org/wiki/Weird_Tales/Volume_6/Issue_2/The_Oldest_Story_in_the_World?useparsoid=1 or any other transclusion in which a word is split with a hyphen across two pages in the source (here between https://en.wikisource.org/wiki/Page:Weird_Tales_Volume_6_Number_2_(1925-08).djvu/24 and the next page).

What happens?:
The word is broken in two ("it was necessary to travel very care- fully because").

What should have happened instead?:
Since 2018, ProofreadPage has had functionality that joins these cross-page hyphens, so that, e.g., the above page instead reads "it was necessary to travel very carefully because".
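
For illustration only, the expected joining behaviour can be sketched in a few lines of Python. This is not ProofreadPage's actual code: the function name `join_pages` and the rule "a trailing hyphen means the word continues on the next page" are assumptions made for this example.

```python
def join_pages(pages):
    """Concatenate page texts, merging a word split by a trailing hyphen.

    Hypothetical helper for illustration; not ProofreadPage's implementation.
    """
    result = ""
    for text in pages:
        text = text.strip()
        if not text:
            continue  # skip empty pages
        if result.endswith("-"):
            # Previous page ended mid-word: drop the hyphen, append with no space.
            result = result[:-1] + text
        elif result:
            # Ordinary page boundary: separate the texts with a single space.
            result += " " + text
        else:
            result = text
    return result


print(join_pages(["it was necessary to travel very care-", "fully because"]))
# prints: it was necessary to travel very carefully because
```

A rule this naive cannot tell a line-break hyphen from a genuine one (a compound such as "well-" / "known" would be merged incorrectly), so it only shows the output expected in the case reported here.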

It appears that Parsoid is breaking that functionality. This bug already affects WS export.

(Reporting from https://en.wikisource.org/wiki/Wikisource:Scriptorium#Automatic_removal_of_hyphens_doesn't_work_in_ws-export.)