
Use narrow API calls instead of wbeditentity in Pywikibot
Open, Needs Triage, Public

Description

Pywikibot currently uses editEntity as a wrapper for almost all of its Wikidata editing methods, and editEntity calls wbeditentity. @thiemowmde told us that narrow API calls such as wbsetdescription are preferable to wbeditentity.

Event Timeline

Ladsgroup raised the priority of this task from to Needs Triage.
Ladsgroup updated the task description.
Ladsgroup added subscribers: Ladsgroup, thiemowmde, jayvdb.
jayvdb added a subscriber: Lydia_Pintscher.

Waiting on explanation from Wikidata devs.

I'm a Wikidata dev. What do you expect upstream to do?

The wbeditentity API is, in essence, a fallback you can use when there is no specific API for a specific use case. There are several disadvantages in using wbeditentity, namely:

  • The edit summary will, in most cases, be an unspecific "the entity changed" because that's all the API module knows. Remember, the API module gets a big JSON blob of the full entity. Sure, in theory it's possible to create a diff and traverse it and create more specific summary lines if, for example, only a label changed. But that's (currently) not the job of wbeditentity. I would love to implement some magic that switches to more specific summary lines if possible (this is tracked in T67846), but even that will stop working if, for example, a label and a statement's value are changed in a single wbeditentity call.
  • Submitting a big JSON blob when, for example, only a label changed wastes computational power and bandwidth on all sides.

I suggest using wbsetlabel, wbsetdescription and wbsetaliases if only a single label, description or set of aliases (in one language) changed. I suggest using wbsetsitelink if only a single sitelink changed, and wbsetclaim if only a single statement changed. These are the most important and most relevant changes, and they produce the most bot-triggered summary lines. Switching to the more specific API modules would make these edits much, much more visible to all users.
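The suggestion above could be sketched roughly as follows. This is only an illustration, not Pywikibot code: build_request and NARROW_MODULES are hypothetical names, only labels and descriptions are handled, and the token and summary parameters that a real request needs are left out.

```python
import json

# Hypothetical mapping from a section of the entity JSON to the narrow
# API module that edits it. Aliases, sitelinks and claims are omitted.
NARROW_MODULES = {
    "labels": "wbsetlabel",
    "descriptions": "wbsetdescription",
}


def build_request(entity_id, diff):
    """Return API parameters for the narrowest module that covers `diff`.

    `diff` uses the same shape as the wbeditentity `data` parameter, e.g.
    {"labels": {"en": {"language": "en", "value": "some label"}}}.
    """
    if len(diff) == 1:
        (section, changes), = diff.items()
        if section in NARROW_MODULES and len(changes) == 1:
            (language, entry), = changes.items()
            return {
                "action": NARROW_MODULES[section],
                "id": entity_id,
                "language": language,
                "value": entry["value"],
            }
    # More than one change, or a change with no narrow module in the
    # mapping above: fall back to wbeditentity.
    return {
        "action": "wbeditentity",
        "id": entity_id,
        "data": json.dumps(diff, sort_keys=True),
    }
```

A diff with one English label maps to wbsetlabel, while a diff that touches both a label and a description still goes through wbeditentity.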

Yes, it is better to use the more specific API actions where possible.

Thanks for replying. That is what I was hoping for.

More specific API actions are perfect if only one edit is needed. However, that is rarely the case. If more than one change is needed, such as adding a statement and adding a source to that statement, the changes should be made in a single edit so that they are atomic. If the network fails after the API call that adds the statement, the source will not be added. Performing multiple changes in a single edit also makes the edit history much more readable for humans.
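To illustrate the statement-plus-source case, here is a minimal sketch of a single wbeditentity payload that carries both. The helper names are hypothetical and this is not Pywikibot's implementation; the item and property IDs (Q42, P31, Q5, P143, Q328) are only example values, and the token parameter is omitted.

```python
import json


def item_snak(property_id, numeric_id):
    """Build a value snak whose value is an item (hypothetical helper)."""
    return {
        "snaktype": "value",
        "property": property_id,
        "datavalue": {
            "type": "wikibase-entityid",
            "value": {"entity-type": "item", "numeric-id": numeric_id},
        },
    }


def statement_with_source(claim_snak, source_snak):
    """Bundle a new statement and its reference into one `data` payload."""
    return {
        "claims": [{
            "type": "statement",
            "rank": "normal",
            "mainsnak": claim_snak,
            # References group their snaks by property ID.
            "references": [
                {"snaks": {source_snak["property"]: [source_snak]}}
            ],
        }]
    }


payload = statement_with_source(item_snak("P31", 5), item_snak("P143", 328))
params = {
    "action": "wbeditentity",
    "id": "Q42",
    "data": json.dumps(payload),
}
```

Because the statement and its reference travel in one request, either both are saved or neither is.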

Regarding the 'big JSON blob': Pywikibot does create a 'diff' to send to the wbeditentity API, and the wbeditentity API does support that. Sure, there is some processing required on both sides, but the same 'edits' are sent with roughly the same amount of data pushed across the wire, and with fewer round trips and less overhead.
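As a rough picture of what such a 'diff' means, the sketch below compares two plain language-to-label mappings and keeps only the entries that were added or changed. label_diff is a hypothetical function written for this illustration, not Pywikibot's actual diff code, and it deliberately ignores removed labels.

```python
def label_diff(old, new):
    """Return only the labels in `new` that differ from `old`.

    Both arguments map a language code to a label string. The result uses
    the per-language shape that the wbeditentity `data` parameter expects
    under its "labels" key. Removed labels are not handled in this sketch.
    """
    diff = {}
    for language, value in new.items():
        if old.get(language) != value:
            diff[language] = {"language": language, "value": value}
    return diff


old_labels = {"en": "Douglas Adams", "de": "Douglas Adam"}
new_labels = {"en": "Douglas Adams", "de": "Douglas Adams", "fr": "Douglas Adams"}
changed = label_diff(old_labels, new_labels)
```

Here only the corrected German label and the new French label end up in the payload; the unchanged English label is not sent again.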

The automatic summaries are a problem, but hopefully that will be solved by the repo software soon. If not, we can solve it on the client side. In any case, bots should be using informative edit summaries.

I don't think we'll be able to dedicate time soon to providing edit summaries for more types of edits.

Even if this were implemented in Pywikibot, there'd still be lots of clients without this feature.
For example, the Wikibase GUI itself could allow editing multiple values at once and provide a meaningful summary.

Change 368171 had a related patch set uploaded (by Zoranzoki21; owner: Danmichaelo):
[pywikibot/core@master] [FEAT] Use wbsetlabel and wbsetdescription

https://gerrit.wikimedia.org/r/368171