Page MenuHomePhabricator

Buggy data-mw for templated content if it produced an unclosed table.
Closed, ResolvedPublic

Description

mznwiki:بوئین_و_میاندشت (oldid=7203) has a roundtrip diff where the last 2 characters are being swallowed. This bug got exposed because of https://gerrit.wikimedia.org/r/#/c/244588/ after a roundtrip test run.

Arlo and Scott figured out that this is because of a bug in migrateTrailingNLs that is moving newlines out of the table ( and effectively out of the template with the unclosed table) which throws off the DSR computation algorithm.

  • The template produces an unclosed table "{|\n...\n|-"

*. All top-level page content after the transclusion now ends up inside the table.

  • Since this content is inside a <tr>, all the content (except for the newlines) get fostered before the table.
  • migrateTrailingNLs moves the above newlines out of the table which end up as next siblings of the table.
  • DSR algo now sees a DOM which has part of top-level content (all the categories on that page) in forstered position before the table, and part of top-level content (newlines between the categories) after the table. So, the <table> gets a DSR value that includes everything except for the newlines.
  • Template wrapping marks all the fostered content and table as being template-produced, but the newlines are left out .. The size of the omitted newlines is the difference in lost characters in data-mw.

After a lot of discussion, we've discovered that the bug here is that the newlines should never have left the table since the content they were attached to are still "in" the table (they've actually been fostered *before* the table, but that's still considered "in" the template-affected area). The general bug here is that whenever content is fostered out of a table, whitespace and comments attached to that fostered content should remain in their original node so they don't inadvertently leave a template-affected scope.

Event Timeline

ssastry assigned this task to cscott.
ssastry raised the priority of this task from to Medium.
ssastry updated the task description. (Show Details)
ssastry added a project: Parsoid.
ssastry subscribed.

Change 245603 had a related patch set uploaded (by Cscott):
Disable migrateTrailingNLs if table has had content fostered out of it

https://gerrit.wikimedia.org/r/245603

cscott set Security to None.

Change 245603 merged by jenkins-bot:
Disable migrateTrailingNLs if table has had content fostered out of it

https://gerrit.wikimedia.org/r/245603