Change over to new talk page API
Closed, Resolved (Public)

Description

It looks like we have a new ActionAPI coming that will be able to handle our Talk Page needs (IDs, time stamps, user names, etc.): https://gerrit.wikimedia.org/r/c/mediawiki/extensions/DiscussionTools/+/711505/

We need to confirm that we can switch from the /page/talk API to this new one: everything we get from the existing API must also be available from the new one (and then some; we should also get users, datestamps, IDs, etc.). Then switch over to using it.

Event Timeline

LGoto triaged this task as High priority. Feb 14 2022, 5:26 PM

@JTannerWMF @MattCleinman
I've been poring over the new Talk API, and while most things check out, there seem to be a couple of issues:

  • The new API does not seem to give us the topmost "banner" section of the talk page, i.e. the text at the top that is not part of any thread. The old (current) API gives us this section anyway, even though it's not tied to a thread. The options would be:
    (a) ask whether it's feasible to include the banner text as a "thread" in the new API;
    (b) make an additional API call just to get the parsed HTML of the topmost banner, which wouldn't be the end of the world but would be a bit of a waste; or
    (c) consider not showing the banner in our interface (up to Jazmin/Josh).

And I also see what might be a bug:

There's a thread titled "Improper Grammar" which has text and a reply, but the API reports it as empty:
https://en.wikipedia.org/w/api.php?action=discussiontoolspageinfo&format=json&page=Talk:Cat&prop=threaditemshtml&formatversion=2

There's also a thread titled "what an absurd title", which has text and one reply, but the API reports only the text contents, and the reply is missing.
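
For anyone wanting to reproduce the check, here is a minimal Python sketch (not the app code) that builds the request above and lists headings that come back with no replies. The response shape it walks (a `discussiontoolspageinfo.threaditemshtml` list whose items carry `type`, `html` and `replies`) is an assumption based on what the endpoint appeared to return, and `thread_items_url` and `headings_without_replies` are hypothetical helper names.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def thread_items_url(page):
    """Build the same discussiontoolspageinfo request as the URL quoted above."""
    params = {
        "action": "discussiontoolspageinfo",
        "format": "json",
        "page": page,
        "prop": "threaditemshtml",
        "formatversion": "2",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def headings_without_replies(response):
    """Return the heading HTML of every thread reported with no replies.

    Assumed shape: response["discussiontoolspageinfo"]["threaditemshtml"]
    is a list of items with "type", "html" and "replies" keys.
    """
    info = response.get("discussiontoolspageinfo", {})
    items = info.get("threaditemshtml", [])
    return [
        item.get("html", "")
        for item in items
        if item.get("type") == "heading" and not item.get("replies")
    ]
```

Any heading this returns for a thread that visibly has comments on the wiki is a candidate for the kind of parser miss described here.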

Found another possible issue, regarding subscribing to topics:

Looking again at [[Talk:Cat]], when I use the Desktop site to subscribe to the first thread in the page ("Semi-protected edit request..."), and inspect the network request made by the site, it sends a comment id of h-67.218.240.165-2021-12-04T08:18:00.000Z to make the subscription.

However, that's not the same ID as the new API gives us for the same heading in the structure that we receive. The comment id given by the new API is h-Semi-protected_edit_request_on_4_December_2021-2021-12-04T08:18:00.000Z, which does not seem to work for subscribing to that topic.

(unless I'm misunderstanding the usage of the subscription api)

The "Improper Grammar?" and "what an absurd title" issues are actually both an issue with the user "beforeAdapter" -- their signature (beforeAdapter (talk · contribs) 2022-03-20T17:22Z) is breaking the signature policy by using the wrong timestamp format and so their comments aren't being picked up by the parser.
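
To illustrate why such a signature gets skipped, here is a rough Python sketch. It is only an approximation: the real comment parser is localized and far more involved, and the regular expression below is an assumption modelled on the default English Wikipedia timestamp (e.g. "05:19, 30 November 2020 (UTC)"). `has_standard_timestamp` is a hypothetical name.

```python
import re

# Assumption: default English Wikipedia signature timestamp,
# e.g. "05:19, 30 November 2020 (UTC)". Other wikis use other formats.
SIGNATURE_TIMESTAMP = re.compile(
    r"\b\d{2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4} \(UTC\)"
)


def has_standard_timestamp(comment_text):
    """Return True if the text contains a timestamp in the assumed format."""
    return SIGNATURE_TIMESTAMP.search(comment_text) is not None
```

With this check, a comment ending in "05:19, 30 November 2020 (UTC)" passes, while one ending in "2022-03-20T17:22Z" does not, which matches the behaviour described above.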

The subscription issue is because we've all managed to miss that we're not providing you with the actual thing you need to send to that API, which is item names rather than IDs.

I've got a patch up in the original task adding a name field, which is what you'll need to send to that API.

(Distinction: an item name is guaranteed to stay the same if an item is moved around -- it doesn't include data about its context, which the id does.)
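
A small Python sketch of what that means for a client, under assumptions: the `discussiontoolssubscribe` action and its `page`, `commentname`, `subscribe` and `token` parameters are my understanding of the subscription API and should be checked against the live API help, and `subscription_params` is a hypothetical helper. The point is only that the value sent is the item's `name`, never its `id`.

```python
def subscription_params(page, item, token, subscribe=True):
    """Build request parameters for (un)subscribing to a thread item.

    Assumptions: action and parameter names are not confirmed by this task;
    `item` is a thread item dict carrying the new "name" field.
    """
    name = item.get("name")
    if not name:
        # The "id" includes context (e.g. the heading text) and is not
        # what the subscription API expects, so do not fall back to it.
        raise ValueError("thread item has no 'name' field")
    params = {
        "action": "discussiontoolssubscribe",
        "format": "json",
        "page": page,
        "commentname": name,
        "token": token,
    }
    if subscribe:
        # MediaWiki boolean parameters are true when present.
        params["subscribe"] = "1"
    return params
```

For the Talk:Cat heading discussed above, the `name` would be h-67.218.240.165-2021-12-04T08:18:00.000Z, while the `id` is h-Semi-protected_edit_request_on_4_December_2021-2021-12-04T08:18:00.000Z.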

Hi @DLynch @Dbrant - I have a few questions about the API after looking over some of design's initial hopes and dreams. These aren't necessarily something we need to jump on right away (especially the last question), but I want to get some thoughts on need/difficulty while we're touching this new endpoint. Thanks!

Stripping signatures
Our comment cell mocks have the username and user talk page link in various places instead of inline signatures. The API gives us the username as a separate field, which is perfect, and we can construct the talk link from that and stick it wherever we want. But the API also returns the username and talk link as a signature at the end of every comment object in the html field. Would it be possible to strip that signature on the API side, or is it a bad idea to touch these?
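
For reference, constructing the talk link from the separate username field could look roughly like this Python sketch. The base URL and the `user_talk_url` helper are illustrative assumptions; a real client would use the wiki's own article path and localized namespace name.

```python
from urllib.parse import quote


def user_talk_url(author, base="https://en.wikipedia.org/wiki/"):
    """Build a user talk page URL from the API's "author" field.

    Assumptions: English namespace name "User_talk" and a /wiki/ article path.
    """
    title = "User_talk:" + author.replace(" ", "_")
    # Keep ":" and "/" readable, percent-encode everything else unsafe.
    return base + quote(title, safe=":/")
```

For the example comment in this section, `user_talk_url("AlexisA")` gives https://en.wikipedia.org/wiki/User_talk:AlexisA.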

Example mock:

Screen Shot 2022-03-30 at 5.34.46 PM.png (159×263 px, 17 KB)

Current API object:

"html": "And now that I look at it further, it seems likely that the Hippopotomuses were larger.  I think this was just someone adding in misinformation.  Sorry for spreading it. I'll revise it to largest <a rel=\"mw:WikiLink\" href=\"./Primate\" title=\"Primate\" id=\"mwEQ\">primate</a>. - <a rel=\"mw:WikiLink\" href=\"./User:AlexisA\" title=\"User:AlexisA\" id=\"mwEg\">AlexisA</a> (<a rel=\"mw:WikiLink\" href=\"./User_talk:AlexisA\" title=\"User talk:AlexisA\" id=\"mwEw\">talk</a>) 05:19, 30 November 2020 (UTC)",
"timestamp": "2020-11-30T05:19:00.000Z",
"author": "AlexisA"

User redlinks
Related is the ability to show red links for a user page or user talk page:

Screen Shot 2022-03-30 at 5.42.52 PM.png (120×264 px, 13 KB)

We could potentially lose the ability to detect this if we go through with #1 above (stripping signatures), because signature links carry the class="new" attribute for redlinks. We would need a couple of new fields so that it ultimately looks like:

"html": "And now that I look at it further, it seems likely that the Hippopotomuses were larger.  I think this was just someone adding in misinformation.  Sorry for spreading it. I'll revise it to largest <a rel=\"mw:WikiLink\" href=\"./Primate\" title=\"Primate\" id=\"mwEQ\">primate</a>.",
"timestamp": "2020-11-30T05:19:00.000Z",
"author": "AlexisA",
"authorPageExists": true,
"authorTalkPageExists": true

Edit preview
We have an Edit Source option in our mocks, and I noticed there's a discussiontoolspreview API. The json response doesn't seem to follow the same structure as the threaditemshtml API call though. I'm wondering if it would be possible to have a previewing endpoint that matches the threaditemshtml response, to give the users the ability to test out a full page or section of wikitext and preview how it would look in this new API format.

> Would it be possible to strip that signature on the API side, or is it a bad idea to touch these?

Signatures are sort of complicated, because they can be completely arbitrary wikitext. This means that defining exactly what in the comment is part of the signature is hard; the required components are just a link to the user-talk page and a timestamp, and there are users out there who do things like have their signature open with "yours sincerely, <username>", which looks really weird if you strip the username out. There's also a community element of it potentially annoying users to strip their visual customizations entirely, even if we could narrow it down. (More niche, you'll hit particularly weird results once you run into pages where people are doing votes, which tend to be comments that consist entirely of a signature...)

You might want to coordinate with @iamjessklein about signature-related designs -- I know we've scaled back a number of our own plans for more radical reformatting because of concerns like this.

On the redlinks front, you'll be thrilled to know that it's not actually required that you link to your user page in your signature, so even without any changes here you can't guarantee knowing whether the user page exists just from this API call.

> I'm wondering if it would be possible to have a previewing endpoint that matches the threaditemshtml response, to give the users the ability to test out a full page or section of wikitext and preview how it would look in this new API format.

That does sound like a fairly obscure use-case -- "take this wikitext and show me what the comment parser would make of it". Do you think it's actually going to come up in what the apps will be doing? It's technically entirely possible, if it's required, but any situation where it is required is going to involve fun situations where we start defining exactly which pages (or sections of pages) should be treated as discussions and the circumstances that can change that. :D

For pure preview while writing a single comment, the discussiontoolspreview API should be all you'd need -- anything else in that single comment's data should be already-known to you or contained in the response.

@DLynch

> Signatures are sort of complicated, because they can be completely arbitrary wikitext.

That all makes sense - thanks!

> On the redlinks front, you'll be thrilled to know that it's not actually required that you link to your user page in your signature, so even without any changes here you can't guarantee knowing whether the user page exists just from this API call.

Good to know. We may be able to add the coloring client-side if we keep the signatures intact in html and can seek out that new attribute, but I think it'll be up to design/product if it's okay that it's not a guarantee. The inconsistency might be a dealbreaker.
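
A rough Python sketch of that client-side check, assuming signatures stay intact in the html field. It looks for links to the author's user and user talk pages and reports whether each carries class "new". The "./User:" and "./User_talk:" href prefixes follow the example object earlier in this task; `user_link_status` is a hypothetical helper, and a result of None means no such link was found, which is exactly the "not a guarantee" case.

```python
from html.parser import HTMLParser


class _LinkCollector(HTMLParser):
    """Collect (href, classes) for every <a> tag in a comment's HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href") or ""
        classes = set((attr_map.get("class") or "").split())
        self.links.append((href, classes))


def user_link_status(comment_html, author):
    """Return {"user": ..., "talk": ...} with True (page exists),
    False (redlink, class "new") or None (no link in the signature)."""
    collector = _LinkCollector()
    collector.feed(comment_html)
    name = author.replace(" ", "_")
    targets = {"user": "./User:" + name, "talk": "./User_talk:" + name}
    status = {"user": None, "talk": None}
    for href, classes in collector.links:
        for key, prefix in targets.items():
            # Redlink hrefs may carry a query string after the title.
            if href == prefix or href.startswith(prefix + "?"):
                status[key] = "new" not in classes
    return status
```

Design and product would still need to decide how to render the None case.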

> Do you think it's actually going to come up in what the apps will be doing?

I'll get more details on this. It may be too early to worry about it, but I think the hope is for there to be an option to edit the wikitext of the full talk page. Essentially Desktop's Edit source > Preview > Publish flow. The preview part is what worries me - how we would display that without the ability to test out arbitrary wikitext through the comment parser.

> The preview part is what worries me - how we would display that without the ability to test out arbitrary wikitext through the comment parser.

The interesting problem here is that there will exist cases which will switch the mode in which the page should be displayed. There are pages outside the Talk namespaces which are sometimes discussion pages and sometimes article pages, and which mode you should be previewing in would vary depending on how the wikitext changes. It might be best to just do the regular article preview regardless of the mode, if you want to avoid complex logic.

Pretty sure this is complete.