More flexible user experience for monitoring Wikidata changes from client, using new filters
Open, Needs Triage · Public

Description

As the WMF's Collaboration team pushes forward the awesome new edit review filters, there is an opportunity to take advantage of their capabilities and make it possible for (client) wikis to monitor Wikidata changes on recentchanges/watchlist without the feed becoming too noisy. This can be done by letting users filter Wikidata edits with more options.

Current status

  • Wikidata has an ongoing effort to push fewer irrelevant changes to recentchanges (for example: T151717). Even when completed, it will not address the needs of users who actually want to show Wikidata changes on the client wikis.
  • The new filters support show/hide Wikidata (similar to classic recentchanges), with no finer-grained options.

Solution
I suggest adding a new Wikidata filter group with the following sub-options, according to the Wikidata aspects that were changed: description/statement/label/sitelink.

Example Use cases:

  • description changes - this is a highly requested feature (on enwiki), as changes to descriptions affect mobile search and aren't monitored outside Wikidata. By showing only Wikidata changes that change descriptions (hiding labels/sitelinks/statements), users can easily monitor them. It may be valid to show only descriptions in the content language of the client wiki (workaround for T173144)
  • sitelink changes - such changes are usually not interesting and were classically hidden from users (they were maintained by interwiki bots with a bot flag), so it should be possible for a user to hide them.
  • label/statement changes - such changes may affect infoboxes and other data imported from Wikidata.

Suggested implementation

  • Implement new filters (ChangesListStringOptionsFilterGroup, ChangesListStringOptionsFilter)
  • The filters should query the recentchanges table with rc_source=wb and according to the type of the filter (tricky - see below).
    • I think the changed aspects of Wikidata changes aren't actually modeled in the client wiki (it is probably not efficient to query rc_params/rc_comment?)
    • So we should either query the Wikidata repo (from the server side? or maybe from the client side using RepoApi?)
    • Or, alternatively, we could use (or abuse) tag_summary and use tags to indicate the Wikidata aspects related to a change (just ~4 tags)
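
The tag-based option can be illustrated with a minimal Python sketch. It is not MediaWiki code: the tag names, the RecentChange class and filter_changes are hypothetical, and rows are modeled in memory rather than queried from recentchanges/tag_summary. It only shows the intended filter semantics, assuming each Wikidata change carries one tag per changed aspect.

```python
from dataclasses import dataclass, field

# Hypothetical tag names; the task only says "just ~4 tags".
ASPECT_TAGS = {
    "description": "wikidata-description",
    "label": "wikidata-label",
    "statement": "wikidata-statement",
    "sitelink": "wikidata-sitelink",
}


@dataclass
class RecentChange:
    """Simplified stand-in for a recentchanges row joined with its tags."""
    rc_id: int
    rc_source: str
    tags: frozenset = field(default_factory=frozenset)


def filter_changes(changes, shown_aspects):
    """Keep non-Wikidata rows, plus Wikidata rows tagged with a shown aspect."""
    wanted = {ASPECT_TAGS[aspect] for aspect in shown_aspects}
    return [
        rc for rc in changes
        if rc.rc_source != "wb" or rc.tags & wanted
    ]


changes = [
    RecentChange(1, "mw.edit"),
    RecentChange(2, "wb", frozenset({"wikidata-description"})),
    RecentChange(3, "wb", frozenset({"wikidata-sitelink"})),
    RecentChange(4, "wb", frozenset({"wikidata-label", "wikidata-sitelink"})),
]
visible = filter_changes(changes, {"description", "label"})
print([rc.rc_id for rc in visible])  # [1, 2, 4]
```

Because a change can carry several tags, an edit that touches both a label and a sitelink (row 4) stays visible as long as at least one of its aspects is shown.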

Event Timeline

Restricted Application added a subscriber: Aklapper.
Mattflaschen-WMF renamed this task from Monitoring Wikidata changes from client using new filters to More flexible user experience for monitoring Wikidata changes from client, using new filters. · Sep 26 2017, 7:51 AM

Perhaps this could be implemented by adding more rc_source possibilities (e.g. 'wb.description', 'wb.sitelinks'). However, this may have implications for T171027: "Read timeout is reached" DBQueryError when trying to load specific users' watchlists (with +1000 articles) on several wikis.

We would need backwards compatibility, but the recentchanges table only lasts 30 days (configurable) anyway, so this is very temporary.
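
As a rough sketch of what that backwards compatibility could look like (Python for illustration only; matches_aspects is a hypothetical helper, and showing legacy rows by default is an assumption, not a decision made in this task): rows written before the change still have the plain 'wb' source, so the filter cannot know which aspect they touched.

```python
LEGACY_SOURCE = "wb"  # existing rc_source value for all Wikidata changes


def matches_aspects(rc_source, shown_aspects):
    """Return True if a row with this rc_source passes the aspect filter.

    Assumption: legacy 'wb' rows are shown, because their aspect is unknown
    and they age out of recentchanges after the retention period anyway.
    """
    if rc_source == LEGACY_SOURCE:
        return True
    prefix, sep, aspect = rc_source.partition(".")
    if prefix != LEGACY_SOURCE or not sep:
        return True  # not a Wikidata row; this filter does not apply
    return aspect in shown_aspects


print(matches_aspects("wb", {"description"}))              # True (legacy row)
print(matches_aspects("wb.description", {"description"}))  # True
print(matches_aspects("wb.sitelinks", {"description"}))    # False
```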

I think a single edit in Wikidata can change the description, label and sitelink at once (at least the API exposes an "edit entity" action), so assuming it is a one-to-many relationship, I would stick to tag_summary.

To solve that, we could evaluate splitting one repository edit (which affected e.g. all of those) into multiple separate client edits (each with only one rc_source, e.g. one for wb.description, one for wb.sitelinks, etc.).
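
A minimal sketch of that splitting idea, in Python for illustration only (split_repo_edit and the row fields are hypothetical; the real client would write recentchanges rows):

```python
def split_repo_edit(repo_rev_id, changed_aspects):
    """Return one client-side row per changed aspect of a single repo edit."""
    return [
        {"repo_rev_id": repo_rev_id, "rc_source": f"wb.{aspect}"}
        for aspect in sorted(changed_aspects)
    ]


rows = split_repo_edit(12345, {"sitelinks", "description"})
for row in rows:
    print(row)
# {'repo_rev_id': 12345, 'rc_source': 'wb.description'}
# {'repo_rev_id': 12345, 'rc_source': 'wb.sitelinks'}
```

Each row then has exactly one rc_source, at the cost of writing up to one extra row per changed aspect for every repository edit.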

However we implement this, we should evaluate the performance implications so it doesn't worsen T171027 .