
Unable to respond to specific comments
Closed, Resolved (Public)

Description

As reported here [1], it is not possible to reply to the following comments posted to this talk page [2]:

  • Comment posted at 16:30, 10 December 2019 (UTC)
  • Comment posted at 02:50, 12 December 2019 (UTC)

Expected behavior

  1. Go to: https://en.wikipedia.beta.wmflabs.org/wiki/Talk:Cats
  2. Click the "Reply" link appended to either of the following comments:
    • Comment posted at 16:30, 10 December 2019 (UTC)
    • Comment posted at 02:50, 12 December 2019 (UTC)
  3. Compose a reply
  4. Click "Reply"
  5. ✅ Notice your comment has successfully been published to the talk page

Actual behavior

  1. Go to: https://en.wikipedia.beta.wmflabs.org/wiki/Talk:Cats
  2. Click the "Reply" link appended to either of the following comments:
    • Comment posted at 16:30, 10 December 2019 (UTC)
    • Comment posted at 02:50, 12 December 2019 (UTC)
  3. Compose a reply
  4. Click "Reply"
  5. ⚠️ Notice a yellow rectangular highlight appears where you would expect your comment to be posted, but the comment has not been posted to the talk page and no trace of it is shown in the talk page's History page.

Additional context

The person who first reported this issue, @Alsee, noted that this earlier edit might be the cause of the inability to reply to the comments above.


  1. https://www.mediawiki.org/w/index.php?title=Topic:Vcwvt3bq03o5gv8h&topic_showPostId=vd8593amh18c3myw#flow-post-vd8593amh18c3myw
  2. https://en.wikipedia.beta.wmflabs.org/wiki/Talk:Cats

Event Timeline

I guess somebody cleaned up the page since this task was filed, as I don't see the comment at 02:50. I do see the comment at 16:30 (the first comment on the page), though, and I can reproduce the problem with it. For reference, I was testing on this revision: https://en.wikipedia.beta.wmflabs.org/w/index.php?title=Talk:Cats&oldid=410946

The beginning of the page looks like this:

== How can we make this article better? ==

this article is a good start, but I think we could add some deeper topics and information here. [[Special:Contributions/10.0.3.1|10.0.3.1]] 16:30, 10 December 2019 (UTC)

: Is there a cat show, something similar to the [[National_Dog_Show]]? This would be a useful type of event to reference on this page. [[User:Yatu|Yatu]] ([[User talk:Yatu|talk]]) 19:21, 11 December 2019 (UTC)
:: There is a national Cat Show. it's held every year in Pasadena in conjunction with the Rose parade. It specifically is for people who are not into football. [[Special:Contributions/192.168.122.1|192.168.122.1]] 21:32, 11 December 2019 (UTC)
:test [[User:Yatu|Yatu]] ([[User talk:Yatu|talk]]) 19:27, 27 December 2019 (UTC)
::testy [[Special:Contributions/68.195.105.134|68.195.105.134]] ([[User talk:68.195.105.134|talk]]) 08:15, 31 December 2019 (UTC)
:::Hello.<ref name="1nbsp;&nbsp;2">Source one, source two</ref> Goodbye. [[Special:Contributions/68.195.105.134|68.195.105.134]] ([[User talk:68.195.105.134|talk]]) 08:15, 31 December 2019 (UTC)
::::Hello2 [[Special:Contributions/68.195.105.134|68.195.105.134]] ([[User talk:68.195.105.134|talk]]) 08:16, 31 December 2019 (UTC)
:::::hello3 [[Special:Contributions/68.195.105.134|68.195.105.134]] ([[User talk:68.195.105.134|talk]]) 08:16, 31 December 2019 (UTC)
::proud to prototype =) [[link]] and with Eve? [[Special:Contributions/85.203.67.228|85.203.67.228]] ([[User talk:85.203.67.228|talk]]) 01:12, 4 January 2020 (UTC)
:Reply-5 [[Special:Contributions/68.195.105.134|68.195.105.134]] ([[User talk:68.195.105.134|talk]]) 08:22, 31 December 2019 (UTC)
::{{#if:x|<span style="}} font-weight: bold">red test 2. [[Special:Contributions/68.195.105.134|68.195.105.134]] ([[User talk:68.195.105.134|talk]]) 08:33, 31 December 2019 (UTC)
:::{{welcome}} [[User:Ppelberg-test|Ppelberg-test]] ([[User talk:Ppelberg-test|talk]]) 20:01, 5 January 2020 (UTC)

Note the big mis-indented block at the end, generated by the {{welcome}} template.

When trying to reply to the comment, the reply is inserted according to the indentation, which happens to be in the middle of the content generated by the {{welcome}} template.

Since we can't represent this in wikitext, the reply is then lost when saving.

Change 564854 merged by jenkins-bot:
[mediawiki/extensions/DiscussionTools@master] Pick reply insertion point based on parser tree, not DOM tree

https://gerrit.wikimedia.org/r/564854

@matmarex it looks like the code for where to place the reply is running into trouble because it's trying to follow HTML/parser structure instead of the comment structure. People don't view the structure the way a formal parser does. You need to step away from the usual parsing expectations and focus on what humans are keyed-into.

I was looking at this test case linked on your gerrit patch. The result was extremely confusing - I can't imagine a human posting like that. I only figured out what was going on after working out in reverse why the test case caused difficulty.

Consider this reasonably common example:

Comment 1. Signature.
:Comment 2. Signature.
::Comment 3 paragraph 1.
::Comment 3 paragraph 2.
TEMPLATE, TABLE, IMAGE, BLOCKQUOTE, or other content posted at zero indent.
::Comment 3 paragraph 3. Signature.
New replies go here, regardless of whether it is a reply to comment 1, comment 2, or comment 3.

I'm trying to illustrate that a comment begins on the line after a signature, and it runs to the end of the next signature. Any markup or change of indentation in between is irrelevant. That's what's tripping up the code - it's seeing indentation changes inside the comment. The fact that we drop to zero indentation in the middle of a comment doesn't change the fact that this comment was posted at level 2 indentation. My example will closely match your test case if we remove paragraph 3, leaving the signature on the final zero-indent line. That doesn't change where new replies should land - below comment 3. It looks like your latest code would post above comment 3, tripped up by the internal indentation.

[...] a comment begins on the line after a signature, and it runs to the end of the next signature. Any markup or change of indentation in between is irrelevant.

While I absolutely agree with this, I don't think this particular issue will matter as much in the very first versions. I would assume that the extension would first be enabled as a Beta Feature on only a few projects, which would mean that users would largely be choosing to use this interface intentionally and would expect some things to break (similar to the acceptance of the relatively high failure rate of Enterprisey's reply-link user script). Of course, it would still be worrying if this issue doesn't get resolved.

[...] In the longer term [...], we plan to introduce new syntax for multiline list items (T230683)

On the other hand, it will no longer be necessary to insert block content without indentation if multiline comments/list items are implemented. User adoption of the syntax would still be an issue, but using the multiline syntax would almost certainly become the preferred way to insert block content, and the impact of this issue (if it doesn't get fixed) would be lessened.

@matmarex it looks like the code for where to place the reply is running into trouble because it's trying to follow HTML/parser structure instead of the comment structure. People don't view the structure the way a formal parser does. You need to step away from the usual parsing expectations and focus on what humans are keyed-into.

It was, but the patch above fixes it. It follows the comment structure now. (I realize now that the "parser tree" in my commit message is ambiguous, so to clarify, this refers to DiscussionTools' internal parser – which builds a tree structure of comments and replies from the page's HTML – not the wikitext parser.)

I'm trying to illustrate that a comment begins on the line after a signature, and it runs to the end of the next signature. Any markup or change of indentation in between is irrelevant. That's what's tripping up the code - it's seeing indentation changes inside the comment. The fact that we drop to zero indentation in the middle of a comment doesn't change the fact that this comment was posted at level 2 indentation. My example will closely match your test case if we remove paragraph 3, leaving the signature on the final zero-indent line. That doesn't change where new replies should land - below comment 3. It looks like your latest code would post above comment 3, tripped up by the internal indentation.

Yes, it would. But the reason for this is slightly different than you're guessing.

If a multi-line comment has a change in indentation in the middle, like in the example you gave, this problem actually doesn't occur. As long as the first and last line of the comment have the same indentation, it's unambiguous, and we will correctly pick the indentation and position of any replies.

But if the indentation is different between the first and last line (in particular, if there are only two lines), we have to guess whether you meant to indent your entire comment and forgot the * or : markers on subsequent lines, or whether you actually meant to include a bullet list as a part of your comment. Right now the guessing algorithm is just "take the lower indentation level of the two" (https://github.com/wikimedia/mediawiki-extensions-DiscussionTools/blob/7de6b4e04ac76ab8aa033dc944af216e1e60750a/modules/parser.js#L716), and in the example I provided on the patch, that results in the comment being treated as a top-level comment (indent level 0).
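
For readers following along, here is a rough sketch of that "take the lower of the two" guess. It is not the code from parser.js: the function names indentLevel and guessCommentLevel are made up for this illustration, and the sketch counts list markers on wikitext lines, whereas DiscussionTools works on the page's HTML.

```javascript
// Hypothetical helper (not from DiscussionTools): count the leading
// list markers (: * #) at the start of a wikitext line.
function indentLevel( line ) {
	// The pattern always matches, possibly with an empty string.
	return /^[:*#]*/.exec( line )[ 0 ].length;
}

// Hypothetical sketch of the guess described above: when the first and
// last line of a comment disagree, use the lower indentation level.
function guessCommentLevel( firstLine, lastLine ) {
	return Math.min( indentLevel( firstLine ), indentLevel( lastLine ) );
}

// A two-line comment whose signature ends up on a zero-indent line
// is treated as a top-level comment.
console.log( guessCommentLevel( '::First line of the comment.', 'Second line. Signature.' ) ); // 0

// When both lines agree, there is nothing to guess.
console.log( guessCommentLevel( ':First line.', ':Last line. Signature.' ) ); // 1
```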

DiscussionTools has a debug mode that visualizes how it parses comments. I posted the two examples on a talk page on Beta so that we can have a look: https://en.wikipedia.beta.wmflabs.org/wiki/Talk:T241391?dtdebug=1 It's a bit messy, but if you look closely at the red guides on the left, in example 1 the comment "ccc" is marked as a top-level comment, while in example 2 it's marked as a reply to comment "bbb" (on the same level as comment "ddd").

The guessing could probably be improved to be correct more often (feel free to propose ideas). I haven't had time to think about it more yet, so I picked the simplest reasonable algorithm to keep the code understandable. This isn't a fundamental limitation, just a (relatively) small bug. Ultimately, though, I think this case will always be ambiguous and we won't be able to guess correctly 100% of the time.

it will no longer be necessary to insert block content without indentation if multiline comments/list items are implemented.

First, the syntax proposal currently appears unclear and uncertain. This project should start from a presumption of working on existing pages, and any major new syntax proposal should be considered separately by the community.

More importantly, however, it appears that you misunderstood the intent of my example. It looks like you interpreted the zero-indent as some flaw or limitation. It wasn't. It is uncommon but not unusual for my comments to deliberately drop to zero indent in the middle. New syntax would not change that. The new tool should support posting arbitrary indentation in a comment, and regardless, it needs to behave well around such comments.

I don't think this particular issue will matter as much in the very first versions.

Early versions will of course be incomplete, but it appears that my suggested change is still needed. Even before Flow was built, we knew a key challenge was the software's ability to understand the structure of our discussion pages. I think ignoring the internals of each comment is a key insight to cracking that problem.

  1. Parse the wikitext to deal with things like {{templates}} and <!--comments-->
  2. Use signatures to separate comments (treating paragraphs as indivisible)
  3. Zero in on just the start of each comment to define the indentation relationship between comments.

It will take a bit more code, but it should well match the reader's interpretation of structure.
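
As a minimal sketch of steps 2 and 3 (step 1, handling templates and HTML comments, is left out entirely), the idea could look roughly like this. Everything here is an assumption made for illustration: the names splitIntoComments and startLevel are invented, a signature is detected only by an English-Wikipedia-style UTC timestamp at the end of a line, and the input is raw wikitext rather than whatever DiscussionTools actually operates on.

```javascript
// Assumption: a signed line ends with a timestamp such as
// "16:30, 10 December 2019 (UTC)". Real signatures vary by wiki.
const TIMESTAMP = /\d{2}:\d{2}, \d{1,2} \w+ \d{4} \(UTC\)\s*$/;

// Step 2: group lines into comments; a comment runs from the line after
// one signature to the end of the next signature.
function splitIntoComments( wikitext ) {
	const comments = [];
	let current = [];
	for ( const line of wikitext.split( '\n' ) ) {
		if ( line.trim() === '' && current.length === 0 ) {
			continue; // ignore blank lines between comments
		}
		current.push( line );
		if ( TIMESTAMP.test( line ) ) {
			comments.push( current );
			current = [];
		}
	}
	// Trailing unsigned lines are dropped in this sketch.
	return comments;
}

// Step 3: only the first line of a comment decides its indentation.
function startLevel( commentLines ) {
	return /^[:*#]*/.exec( commentLines[ 0 ] )[ 0 ].length;
}

const page = [
	'Comment 1. 10:00, 1 January 2020 (UTC)',
	':Comment 2. 10:05, 1 January 2020 (UTC)',
	'::Comment 3 paragraph 1.',
	'::Comment 3 paragraph 2.',
	'{{some template}}',
	'::Comment 3 paragraph 3. 10:10, 1 January 2020 (UTC)'
].join( '\n' );

console.log( splitIntoComments( page ).map( startLevel ) ); // [ 0, 1, 2 ]
```

With this rule, the zero-indent template line in the middle of comment 3 does not affect its level, because only the first line of each comment is inspected.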

@Alsee I'm not sure if your comment was meant as a reply to mine (I was a bit hoping to get one), but to be clear, the key insight you describe is exactly how the current DiscussionTools code already works – except for the last point, where instead of using the start indentation, we use the minimum of start and end indentation. (We can change this easily, but I'd like to hear an opinion on this from more than one person.)

@matmarex what's the rationale for the current approach? I can see several reasons for start and end indent to differ, but I can't think of any case where a differing end indent would be giving the correct signal.

Here are two examples:

First example

This long discussion is just getting confusing.  Could everyone just tell me:  Should we use this source?  I'll start:

* You already know that I think we should use this source.  ~~~~

Second example

:What?  Example (talk) 10:11, 12 January 2020 (UTC)
::Look, it's long and complicated.  User (talk) 13:14, 12 January 2020 (UTC)
:::The first problem is that we don't have the sources.
The second problem is that this page needs to be re-written.  ~~~~

In the first example, the next person to comment should reply on the basis of the final indentation. In the second example, the next person to comment should reply on the basis of the first indentation (as the failure to indent some part of a multi-line comment is 'wrong', albeit common enough that it happens every day of the week on the English Wikipedia (example) ). There is no single answer that will always be right. Software can make a guess, and occasionally it'll have to be corrected manually, just like editors are already correcting the 'wrong' guesses made by the existing scripts (example).

@matmarex what's the rationale for the current approach? I can see several reasons for start and end indent to differ, but I can't think of any case where a differing end indent would be giving the correct signal.

The specific kind of comment I'm thinking of is like this:

Everything is terrible! Articles are broken and discussions are broken. [[User:User1|User1]] 19:45, 5 February 2020 (UTC)
:* Articles are not broken, they work fine for me.
:* Discussions are not broken, you're just using them wrong.
:Please elaborate. [[User:User2|User2]] 19:45, 5 February 2020 (UTC)

User2's comment begins at indent level 2 and ends at indent level 1, and the "correct" indent level is 1.
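
To make the trade-off concrete, here is a small sketch that applies three candidate rules (first-line indentation, last-line indentation, and the minimum of the two) to User2's comment above and to the two examples given earlier in this thread. The names indentLevel, strategies and compare are hypothetical, and counting list markers on wikitext lines is a simplification of what the real parser does.

```javascript
// Hypothetical helper: count leading list markers (: * #) on a wikitext line.
function indentLevel( line ) {
	return /^[:*#]*/.exec( line )[ 0 ].length;
}

// Three candidate rules for picking a comment's indentation level.
const strategies = {
	start: ( first, last ) => indentLevel( first ),
	end: ( first, last ) => indentLevel( last ),
	min: ( first, last ) => Math.min( indentLevel( first ), indentLevel( last ) )
};

// Apply every rule to the first and last line of one comment.
function compare( commentLines ) {
	const first = commentLines[ 0 ];
	const last = commentLines[ commentLines.length - 1 ];
	const result = {};
	for ( const [ name, rule ] of Object.entries( strategies ) ) {
		result[ name ] = rule( first, last );
	}
	return result;
}

// User2's comment: starts with a bullet list, ends at level 1.
console.log( compare( [
	':* Articles are not broken, they work fine for me.',
	":* Discussions are not broken, you're just using them wrong.",
	':Please elaborate. [[User:User2|User2]] 19:45, 5 February 2020 (UTC)'
] ) ); // { start: 2, end: 1, min: 1 }

// "Second example" from earlier: the author forgot to indent the last line.
console.log( compare( [
	":::The first problem is that we don't have the sources.",
	'The second problem is that this page needs to be re-written.  ~~~~'
] ) ); // { start: 3, end: 0, min: 0 }

// "First example" from earlier: an unindented intro followed by a bullet.
console.log( compare( [
	'This long discussion is just getting confusing.',
	'* You already know that I think we should use this source.  ~~~~'
] ) ); // { start: 0, end: 1, min: 0 }
```

Going by what was said about each example in this thread (level 1 for User2's comment, the first indentation for the second example, the final indentation for the first example), none of the three rules picks the intended level in all three cases.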

I have not yet found the time to try to find out how common comments like this are, but I've definitely seen such things on the wikis. I was hoping that after we deploy the new tools to some wikis, we can depend on their feedback to decide whether it's more useful to support this case, or the other case.

Another reason why I'd rather wait with investigating this until after the initial deployment is that we actually already have debugging code for this problem:

			if ( startLevel !== endLevel ) {
				console.log( 'Comment starts and ends with different indentation', startNode, node );
			}

…So, you can just view a discussion page, look at the browser console, and immediately see if there are any funky comments on that page worth investigating.

But of course this only works after the extension is deployed somewhere.

Another reason why I'd rather wait with investigating this until after the initial deployment...

@matmarex would I be correct to understand the "this" you are referring to in the comment above as something like the following?
"Investigate whether the comment parser's algorithm needs to be adjusted to more reliably detect the structure of discussions"?

And if so, can I assume that the investigation above would come in response to bug reports where people are noticing the Reply tool posting comments at indentation levels they did not expect?

(I want to make sure I understand the outcomes of this task correctly)

Yes and yes

Great – thank you for confirming, Bartosz.