Page MenuHomePhabricator

Reply tool messes up replying to users with signatures followed by html comments
Open, Needs TriagePublic

Description

Background:
On enwiki, SineBot adds missing signatures for ips and new users. The bot substitutes a template that notes the comment was unsigned, and includes the editor and the time of the comment. After the template, the bot notes <!--Autosigned by SineBot-->

User story:
As someone using the reply tool,
When I reply to a post that was autosigned by SineBot,
My reply should go after the <!--Autosigned by SineBot--> comment

Example: http://en.wikipedia.org/wiki/Special:Diff/966900777

Event Timeline

Restricted Application added a project: User-DannyS712. · View Herald TranscriptJul 10 2020, 7:36 AM
Restricted Application added a subscriber: Aklapper. · View Herald Transcript
DannyS712 moved this task from Unsorted to Reports on the User-DannyS712 board.Jul 10 2020, 7:37 AM
matmarex claimed this task.Jul 22 2020, 6:46 PM

Change 615551 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/DiscussionTools@master] Better handle HTML comments following replies

https://gerrit.wikimedia.org/r/615551

Change 615551 merged by jenkins-bot:
[mediawiki/extensions/DiscussionTools@master] Better handle HTML comments following replies

https://gerrit.wikimedia.org/r/615551

ppelberg closed this task as Resolved.Jul 28 2020, 2:05 AM

Nice.

JTannerWMF reopened this task as Open.Wed, Sep 9, 4:43 PM
JTannerWMF added a subscriber: JTannerWMF.

In our Planning meeting the editing team decided to reopen this task because the issue reoccurred

I can reproduce this iff the comment being replied to is unindented and the signature is wrapped in a <small> tag:

The issues appears to be whitespace between the end of the paragraph and the comment. Parsoid appears to put the whitespace in an inconsistent place. In the list case Parsoid appears to be losing the whitespace completely, which was helping us but is a bug:

wikitext
signature1    <!--comment-->

<small>signature2</small>    <!--comment-->

: signature3    <!--comment-->
: <small>signature4</small>    <!--comment-->
HTML
<p>signature1    </p><!--comment-->

<p><small>signature2</small></p>    <!--comment-->

<dl><dd>signature3<!--comment--></dd>
<dd><small>signature4</small><!--comment--></dd></dl>

Case signature2 has whitespace after the </p> which is triggering the bug

CC @ssastry, @cscott who may want to look into the dropped whitespace and inconsistencies.

Esanders claimed this task.Fri, Sep 11, 1:02 PM
Esanders added a subscriber: matmarex.

Change 626689 had a related patch set uploaded (by Esanders; owner: Esanders):
[mediawiki/extensions/DiscussionTools@master] Skip over whitespace when looking for trailing comments

https://gerrit.wikimedia.org/r/626689

ssastry added a comment.EditedFri, Sep 11, 3:16 PM

"Death by a thousand whitespace cuts" is what we should name the tracking bug.

In the list case Parsoid appears to be losing the whitespace completely, which was helping us but is a bug:

It is not. See T157418: RFC: Make some aspects of Tidy's whitespace stripping behavior part of wikitext parsing "spec". The core parser strips comments from source which leaves the list item with whitespace which Tidy would then strip out. So, Parsoid effectively mimics that behavior as per T157418.

wikitext
signature1    <!--comment-->

<small>signature2</small>    <!--comment-->
HTML
<p>signature1    </p><!--comment-->

<p><small>signature2</small></p>    <!--comment-->

Case signature2 has whitespace after the </p> which is triggering the bug

Regarding the inconsistency, the paragraph wrapping code is a mess because of T134469: doBlockLevels() inserts <p> and </p> randomly with no regard for HTML validity and the need to deal with whitespace, comments, templates, rendering-transparent metadata like categories, blah. But, I suspect the "state machine" logic does whatever is expedient wrt whitespace & comments wrt to where to insert the </p> tag in the buffered token stream. We can tweak it if necessary. But, if nothing is broken on your end because of it, I would much rather not worry about it on our end.

Change 626689 merged by jenkins-bot:
[mediawiki/extensions/DiscussionTools@master] Skip over whitespace when looking for trailing comments

https://gerrit.wikimedia.org/r/626689

But, if nothing is broken on your end because of it, I would much rather not worry about it on our end.

I'm assuming nothing is broken on our end considering this now seems to be functioning as we expect [i] – would this be accurate for me to think, @Esanders ?


i. https://en.wikipedia.beta.wmflabs.org/w/index.php?title=Talk%3ACats&type=revision&diff=450301&oldid=450300

Yes, it should be fine.