
Update parsoid-rs for new redlink handling
Closed, Resolved (Public)

Event Timeline

Legoktm triaged this task as Unbreak Now! priority. Nov 30 2022, 5:01 PM
Legoktm created this task.

For reference, this was caught by the test suite:

thread 'test_iterators' panicked at 'assertion failed: `(left == right)`
  left: `"./Sentence?action=edit&redlink=1"`,
 right: `"./Sentence"`', parsoid/tests/parsoid.rs:224:5

---- test_noinclude_children stdout ----
thread 'test_noinclude_children' panicked at 'assertion failed: `(left == right)`
  left: `"Bar?action=edit&redlink=1"`,
 right: `"Bar"`', parsoid/tests/parsoid.rs:559:5


failures:
    test_iterators
    test_noinclude_children
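
For illustration, both failures have the same shape: the href now carries a trailing `?action=edit&redlink=1`, while the tests expect the bare title. A minimal sketch of normalizing such an href, assuming the suffix always appears in exactly this form (`strip_redlink` is a hypothetical helper, not the actual parsoid-rs fix):

```rust
/// Hypothetical helper (not the parsoid-rs API): drop the redlink query
/// string seen in the failing assertions, if it is present.
/// Assumes the suffix is always exactly `?action=edit&redlink=1`.
fn strip_redlink(href: &str) -> &str {
    href.strip_suffix("?action=edit&redlink=1").unwrap_or(href)
}
```

Applied to the values above, `./Sentence?action=edit&redlink=1` becomes `./Sentence`, and an href without the suffix is returned unchanged.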

No, I messed up, I'll yank 0.7.3 and work on a fix tomorrow:

---- tests::test_extract_tfa_title stdout ----
thread 'tests::test_extract_tfa_title' panicked at 'assertion failed: `(left == right)`
  left: `"SMS Zähringen"`,
 right: `"SMS Z%C3%A4hringen"`', legoktm/tfa-protector-bot/src/main.rs:293:9

Looks like the URL parsing standard says "Code points greater than U+007F DELETE will be converted to percent-encoded bytes by the URL parser", which seems to be how this happens.
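
To make the mismatch concrete: `ä` is U+00E4, which is the bytes `C3 A4` in UTF-8, so a URL parser turns `Zähringen` into `Z%C3%A4hringen`. A rough sketch of undoing that with only the standard library follows; `percent_decode` here is a hypothetical helper, not necessarily what the patch does, and real code would more likely use `percent_decode_str` from the `percent-encoding` crate.

```rust
/// Hypothetical helper: decode `%XX` escapes and interpret the result as UTF-8.
/// Malformed escapes are copied through unchanged; returns `None` if the
/// decoded bytes are not valid UTF-8.
fn percent_decode(input: &str) -> Option<String> {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        // A valid escape is '%' followed by two hex digits.
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            let hi = (bytes[i + 1] as char).to_digit(16);
            let lo = (bytes[i + 2] as char).to_digit(16);
            if let (Some(hi), Some(lo)) = (hi, lo) {
                out.push((hi * 16 + lo) as u8);
                i += 3;
                continue;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    String::from_utf8(out).ok()
}
```

With this, `SMS Z%C3%A4hringen` decodes back to `SMS Zähringen`, the value the test expects.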

This patch should fix it. My Cargo, testing, and Rust skills aren't strong enough to figure out where or how to add tests, but I verified that the featured_articles run now completes (before, it failed on the second file).

Unrelated to this: I noticed a suppressed test in your test files. That will be addressed by https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/862176, which fixes T309024.

LGTM, thanks! Applied and released.