Page MenuHomePhabricator

[12hr] Investigate (and fix, if possible) the use of baserevid with MCR and creating a new MediaInfo entity on an existing page
Closed, ResolvedPublic

Description

Currently, users of the Wikimedia Commons action API cannot create new MediaInfo entity on an existing commons page through the Wikibase interface while using the page's baserevid. An investigation is required to determine how to make this possible.

AC:

  • We know where this operation is blocked (Wikibase or WikibaseMediaInfo?).
  • A way forward to implement this functionality is defined
  • A ticket (or tickets) have been created to tackle this topic or, if the dev considers the fix quick enough (meaning it fits in the remaining time in timebox), fix it

Reproduction:

  1. Upload a new file (and do NOT add any structured data)
  2. Use wbsetclaim to add a statement and set the baserevid to the latest page revision
  3. Receive an error

Current Behavior:
The request results in an error:

{
    "errors": [
        {
            "code": "nosuchrevid",
            "text": "Revision with ID not found.",
            "data": {
                "messages": [
                    {
                        "name": "wikibase-api-nosuchrevid",
                        "parameters": [],
                        "html": "Revision with ID not found."
                    }
                ]
            },
            "module": "main"
        }
    ],
    "docref": "See http://localhost/wiki1/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce> for notice of API deprecations and breaking changes."
}

Expected behavior:
The edit would be allowed, and is successful.

Hints:

  • The issue identified in the ticket is with the wbsetclaim API, but this likely affects ALL APIs when used in this mode.
  • Although the bug is likely in Wikibase or MediaWiki the MediaInfo extension will be greatly beneficial when debugging as it is the only entity type currently setup to work with MCR (Multi-Content Revision).

Reading:

Original Note:

In T263298 one of the issue at play is that you can't provide a baserevid to Wikibase API endpoints to create a mediainfo entity, if it didn't already exist.
This is discovered and confirmed in T263298#6494034

When a file on commons does not get have a media info entity attached to it, you can not use a baserevid in Wikibase API calls
This is a small issue with the MCR integration that someone will need to tackle for the edit conflict rules to be able to work in all cases.

Timeboxed to 12 hours

Event Timeline

One comment that came up in a WMDE meeting was: should you even be able to call setclaim on an entity that doesn't yet exist?

One comment that came up in a WMDE meeting was: should you even be able to call setclaim on an entity that doesn't yet exist?

I don’t think disallowing this will help anyone… people would have to adjust their code from “check if it’s a new entity, if yes omit the baserevid” to “check if it’s a new entity, if yes use a completely separate API action to make effectively the same edit”.

Put another way: if as an “end user” (/ developer) I want to add a caption to a file, why should I have to care whether the “entity part” of the file already exists or not? The UI even shows captions and statements as separate tabs, with captions closer to the wikitext file description than the statements, but the existence of statements is what might determine whether the “entity part” exists or not.

One comment that came up in a WMDE meeting was: should you even be able to call setclaim on an entity that doesn't yet exist?

So this was a decision (probably not recorded anywhere) about how Wikibase was to work with MCR and mediainfo
The main phrasing would probably be something along the lines of...

If a page exists that is going to have a mediainfo entity attached to it, we can pretend that that entity already exists, just is in an empty state (as the ID is already known).
If we ever use MCR with Wikibase and another entity type that doesn;t have a known ID then we may need to revisit this decision.
I seem to remember that this is connected to the fact that MediaInfo will always be able to lookup the entity ID if the page exists, and as the ID is known, wikibase makes assumptions that the entity exists.
Compared to Items, if Q123 is known internally, then the item definitely exists and can be used with various services etc.

ItamarWMDE updated the task description. (Show Details)
ItamarWMDE updated the task description. (Show Details)
darthmon_wmde renamed this task from Investigate the use of baserevid with MCR and creating a new MediaInfo entity on an existing page to Investigate (and fix, if possible) the use of baserevid with MCR and creating a new MediaInfo entity on an existing page.Feb 3 2021, 1:14 PM
darthmon_wmde updated the task description. (Show Details)
noarave renamed this task from Investigate (and fix, if possible) the use of baserevid with MCR and creating a new MediaInfo entity on an existing page to [12hr] Investigate (and fix, if possible) the use of baserevid with MCR and creating a new MediaInfo entity on an existing page.Feb 3 2021, 1:59 PM

Change 661956 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Simplify and fix WikiPageEntityMetaDataLookup::selectRevisionInformationById()

https://gerrit.wikimedia.org/r/661956

Change 661956 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Simplify and fix WikiPageEntityMetaDataLookup::selectRevisionInformationById()

https://gerrit.wikimedia.org/r/661956

Note that this change does not fix this issue – it’s just something that I noticed while trying to figure out how to fix this. I hope it’s an improvement, but I still need to figure out how to actually fix this.

Change 662734 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Use EntityTitleStoreLookup in ResultBuilder

https://gerrit.wikimedia.org/r/662734

Change 662735 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Tolerate some missing revisions in EntityLoadingHelper

https://gerrit.wikimedia.org/r/662735

Change 662736 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseMediaInfo@master] Test that baserevid can be specified with API requests

https://gerrit.wikimedia.org/r/662736

Change 662741 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Pass EntityTitleStoreLookup into ResultBuilder

https://gerrit.wikimedia.org/r/662741

Change 662741 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Pass EntityTitleStoreLookup into ResultBuilder

https://gerrit.wikimedia.org/r/662741

Change 662734 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Use EntityTitleStoreLookup in ResultBuilder

https://gerrit.wikimedia.org/r/662734

Change 664583 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] WIP: Look up revisions with no entity slot as missing revisions

https://gerrit.wikimedia.org/r/664583

Change 666530 had a related patch set uploaded (by WMDE-leszek; owner: WMDE-leszek):
[mediawiki/extensions/WikibaseMediaInfo@master] Added tests for, currently failing, adding statement scenario

https://gerrit.wikimedia.org/r/666530

Change 666529 had a related patch set uploaded (by Addshore; owner: WMDE-leszek):
[mediawiki/extensions/Wikibase@master] Removed duplicate entity revision check from EntitySavingHelper

https://gerrit.wikimedia.org/r/666529

Change 661956 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Simplify and fix WikiPageEntityMetaDataLookup::selectRevisionInformationById()

https://gerrit.wikimedia.org/r/661956

Per @Addshore's comment on https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/662735/ the suggested approach seems to be to have the relevant logic added/changed in a different place than Entity Revision Lookup(s)

This case is only important in cases where the baserevid comes from a user (such as via the api). many entity revision lookups need not concern themselves with this additional check.

The approach explored in https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/662735/, or alternative variant of it in https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikibaseMediaInfo/+/666531, do not seem like a suggested path forward.

Given the timeboxed time of this investigation has been used up, I call this investigation done.
Assigning to @Addshore as he was recently having some ideas how to organize investigation + follow up work coming of it ticket wise.

I'v gone ahead and hit +2 on a patch that will fix the issue and a test to go along with it.

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/662735 & https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikibaseMediaInfo/+/666530

Trying to extract some things from a little discussion we had on mattermost yesterday relevant to the decisions around the approach

Leszek: and any word on having this thing in wikibase wikibase and not wikibase media info?
Adam: well, this applies to MCR, not specifically to mediainfo IMO, so wikibase is probably the right place
Adam: Wikibase is what deals with the MCR support
Leszek: noted, thanks for elaboration (sorry for the late reply - unreliable life schedule)

Lucas: hm, so I’m the only one left who’s not happy with that change (my own)?
Leszek: I am not over the moon about it but for different reasons that you mentioned
Adam: which comments in the CR relate to the open concerns?
Lucas: not sure if I left that in a CR comment at some point, but I’m unhappy with that change because it reconstructs the information “does the revision match the entity ID” almost from scratch, when at an earlier point in the EntityRevisionLookup we already have most of that logic; that’s the motivation for Icd91dc1cb5
Adam: right, so looking at that change my gut would be to stick with this other one as it is done, easy to follow and we are talking about a low traffic occurrence anyway.
Most if not all of the things loaded in https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/662735/6/repo/includes/Api/EntityLoadingHelper.php#288 should be cached in some way (revision lookup)
Lucas: okay
Adam: stares at https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/664583 a bit more

Change 662735 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Tolerate some missing revisions in EntityLoadingHelper

https://gerrit.wikimedia.org/r/662735

Change 666530 merged by jenkins-bot:
[mediawiki/extensions/WikibaseMediaInfo@master] Added tests for, currently failing, adding statement scenario

https://gerrit.wikimedia.org/r/666530

Change 664583 abandoned by Lucas Werkmeister (WMDE):
[mediawiki/extensions/Wikibase@master] WIP: Look up revisions with no entity slot as missing revisions

Reason:

https://gerrit.wikimedia.org/r/664583