Page MenuHomePhabricator

Show entity labels in parsed edit summaries in API requests as well
Closed, ResolvedPublic5 Estimated Story Points

Description

As a Wikidata editor, I want to see “nice” edit summaries when edits are being shown outside of wikidata.org (e.g. in tools), in order to understand what the edit is doing.

Problem:
Wikibase formats links in edit summaries using the label of the linked entity, e.g. [[Property:P460]]: [[Q112795079]] becomes “said to be the same as (P460): Wikidata Sandbox 4 (Q112795079)”. However, this is disabled in API requests, originally to fix T111346; that bug should since have been fixed via T292203, though, so we should be able to enable these links in API requests now without risking a regression on T111346.

Example:
Getting the edit summary of this edit using this API request currently produces:

<span dir="auto"><span class="autocomment">Changed claim: </span></span> <a href="/wiki/Property:P460" title="Property:P460">Property:P460</a>: <a href="/wiki/Q112795079" title="Q112795079">Q112795079</a>

And would change to:

<span dir="auto"><span class="autocomment">Changed claim: </span></span> <a href="/wiki/Property:P460" title="‎said to be the same as‎ | ‎this item is said to be the same as that item, but it's uncertain or disputed‎"><span class="wb-itemlink"><span class="wb-itemlink-label" dir="ltr" lang="en">said to be the same as</span> <span class="wb-itemlink-id">(P460)</span></span></a>: <a href="/wiki/Q112795079" title="‎Wikidata Sandbox 4‎ | ‎sandbox for testing‎"><span class="wb-itemlink"><span class="wb-itemlink-label" dir="ltr" lang="en">Wikidata Sandbox 4</span> <span class="wb-itemlink-id">(Q112795079)</span></span></a>

Implementation note:
This is currently checked in HtmlPageLinkRendererEndHookHandler::doHtmlPageLinkRendererEnd(); see the call to $this->isApiRequest().

Screenshots/mockups:

BDD
GIVEN an edit with wikilinks to items or properties in the edit summary (without custom link text)
WHEN loading the parsed comment via the API
THEN the links are shown using the entity label

Acceptance criteria:

  • API requests that return edit summaries show formatted links with entity labels. This includes action=query + prop=revisions + rvprop=parsedcomment, as linked above, but also the Echo API (see T116762).
  • There is a temporary feature flag that allows us to deploy this to Test Wikidata separately to Wikidata.

Open questions:

  • Does this have any negative effects?
    • We assume that users who analyze edit summaries to determine what changed in an edit generally use the unparsed comment (e.g. /* wbsetclaim-update:2||1|1 */ [[Property:P460]]: [[Q112795079]]), which is unaffected.
NOTE: This is a significant change and will be announced in accordance with the Stable Interface Policy. Announcement is being prepared here.

Event Timeline

Task Breakdown Notes

  • One code change will affect all related API requests including Echo API, no need to create sub-tasks for Echo API

Potential Plan of Action

  • Add a config for a feature flag to switch this change on Test Wikidata
  • Use this new feature flag to determine if isApiRequest() should be checked in HtmlPageLinkRendererEndHookHandler::doHtmlPageLinkRendererEnd()
  • Create a sub-task for enabling new feature flag on Test Wikidata after all subtasks are done in T330130 (TBC @ItamarWMDE)
  • Enable feature flag on Wikidata (TBC @Michael)
  • Remove feature flag from code (TBC @HasanAkgun_WMDE)

Just disabling the check would reintroduce T111346 for the same reason it origially worked:

The hook got run because when editing a wiki page and focusing the
summary field before saving, the stashedit api gets called which
already parses the wikitext and puts it into the parser cache.
Because the api sets $wgTitle to Special:Badtitle/dummy_title_for_
API_calls_set_in_api.php we are technically on a special page and
thus the hook gets run.

Quoted @Bene from T111346#1642128.

So while T292203 did improve the situation, we need to do something smarter.
Also, I note that just removing the MW_API check and thus reintroducing the bug, did not fail any tests. We should probably also fix that.

The current logic is:
"Add Labels iff (!$this->isApiRequest()) && ($linkRenderer->isForComment() || $currentTitle->isSpecialPage())"

Dropping the check for the API Request doesn't work, because it introduces the bug:
"Add Labels iff $linkRenderer->isForComment() || $currentTitle->isSpecialPage()"

But maybe we could modify the logic in way that get's all that we want:
"Add Labels iff $linkRenderer->isForComment() || (!$this->isApiRequest() && $currentTitle->isSpecialPage())"

Curious if @Lucas_Werkmeister_WMDE has thoughts on this. For now, I'll think about how to add a regression test for T111346 as well as for the desired behavior of this task.

What if we change the special page condition to $currentTitle->isSpecialPage() && !$currentTitle->isSpecial( 'Badtitle' ), i.e. treat Special:Badtitle like a non-special page? It’s not like we ever expect users to see real links on that title when it’s used legitimately.

By the way, one interesting consequence of the current conditions is that the links transcluded from Special:AllPages into a wikitext page – e.g. {{Special:AllPages/Item:Q216155}} – will be shown with labels. But at least when I test this on my Test Wikidata user sandbox (permalink), it always shows labels in my current user language, i.e. the main problem from T111346 – that you might see links in a random other user’s language – isn’t happening. So I’m not sure this is even a bug.

Change 910553 had a related patch set uploaded (by Michael Große; author: Michael Große):

[mediawiki/extensions/Wikibase@master] Add regression test for parser cache pollution (T111346)

https://gerrit.wikimedia.org/r/910553

What if we change the special page condition to $currentTitle->isSpecialPage() && !$currentTitle->isSpecial( 'Badtitle' ), i.e. treat Special:Badtitle like a non-special page? It’s not like we ever expect users to see real links on that title when it’s used legitimately.

Mh, that makes sense to me. It's worth trying out 🤔

Change 910432 had a related patch set uploaded (by Michael Große; author: Michael Große):

[mediawiki/extensions/Wikibase@master] Show entity labels in parsed edit summaries in API requests

https://gerrit.wikimedia.org/r/910432

Change 910553 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Add regression test for parser cache pollution (T111346)

https://gerrit.wikimedia.org/r/910553

Change 910432 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Show entity labels in parsed edit summaries in API requests

https://gerrit.wikimedia.org/r/910432

Change 911311 had a related patch set uploaded (by Michael Große; author: Michael Große):

[operations/mediawiki-config@master] Beta-Wikidata: Enable Labels in Wikidata edit summaries

https://gerrit.wikimedia.org/r/911311

The change to beta should make this sufficiently testable for the API endpoint. However, this is not going to affect the Echo notifications (as requested in the AC), because for that those plain titles would have to change to actual links, and that, as I understand it, was abandoned as unviable: 870732: Change notification page names to links.

That being said, it might be possible to turn \MediaWiki\Extension\Notifications\Formatters\EchoEventPresentationModel::getTruncatedTitleText() into using a hook or something, so that Wikibase (or EntitySchema...) can modify the title text instead of just always using \MediaWiki\Title\Title::getPrefixedText().

But that would probably be its own story.

@Michael yes! The hook for that purpose is what I'm requesting here: T334803: Create PageTitle hook

Change 911311 merged by jenkins-bot:

[operations/mediawiki-config@master] Beta-Wikidata: Enable Labels in Wikidata edit summaries

https://gerrit.wikimedia.org/r/911311

Mentioned in SAL (#wikimedia-operations) [2023-04-26T07:20:51Z] <taavi@deploy1002> Started scap: Backport for [[gerrit:911311|Beta-Wikidata: Enable Labels in Wikidata edit summaries (T327062)]]

Mentioned in SAL (#wikimedia-operations) [2023-04-26T07:22:19Z] <taavi@deploy1002> taavi and migr: Backport for [[gerrit:911311|Beta-Wikidata: Enable Labels in Wikidata edit summaries (T327062)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-04-26T07:28:40Z] <taavi@deploy1002> Finished scap: Backport for [[gerrit:911311|Beta-Wikidata: Enable Labels in Wikidata edit summaries (T327062)]] (duration: 07m 48s)

One strange thing that we should maybe look into: while this works in prop=revisions as in Michael’s URL above, it doesn’t seem to work in list=recentchanges and I have no idea why: https://test.wikidata.org/w/api.php?action=query&format=json&list=recentchanges&formatversion=2&rcnamespace=0&rcprop=parsedcomment

Thank you both, this should simplify the EditGroups tool quite a bit! Here is an example of page where I had to render entity links manually, with some Javascript post-processing code:
https://editgroups.toolforge.org/b/CB/3683873dde8d/. (I guess I will keep this logic for a while, for all the revisions fetched already, but it could be dropped in some years)

Happy to make a difference 😊


About this not working with list=recentchanges, strange indeed, but maybe also good, because list=recentchanges was not mentioned in the announcement? Or is the behavior actually expected to be the same and this is thus a bug?

(Also pinging @Lydia_Pintscher and @Arian_Bozorg for a product opinion about bad this is or if it should be treated separately, as an enhancement or as a bug.)

I looked into it a tiny bit, and it turns out the cause is that there are two different services in play.

prop=revisions uses CommentFormatter in https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/refs/heads/master/includes/ServiceWiring.php#413
but
list=recentchanges uses RowCommentFormatter in https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/refs/heads/master/includes/ServiceWiring.php#1741

The latter does not have the [ 'renderForComment' => true ] option for the LinkRendererFactory.

But apart from that, it seems the CommentParserFactory in both is identical? That should probably be extracted to prevent such issues in the future.

(This by now looks very much like a bug to me.)

If it's not too difficult, it would be helpful to have that behavior in list=recentchanges too, typically for tools which do recent changes polling.
(EditGroups sadly uses the EventStreams service and not RC polling directly - not sure if changing list=recentchanges would also impact the EventStreams as well?)

Thanks for checking! In that case I agree it sounds like a bug (probably my fault, when I introduced renderForComment and missed one of the comment formatter services?) that we should probably fix. (But also happy to have a product opinion on this.)

Let's see if that is as straight-forward to fix as I assume that it is.

Change 916420 had a related patch set uploaded (by Michael Große; author: Michael Große):

[mediawiki/core@master] Fix missing Entity Labels in list=recentchanges

https://gerrit.wikimedia.org/r/916420

Change 916420 merged by jenkins-bot:

[mediawiki/core@master] Fix missing Entity Labels in list=recentchanges

https://gerrit.wikimedia.org/r/916420

hoo subscribed.

AFAICT this is done (and the remaining open subtask is also in verification already).

Looks good! Thanks everyone :)