
Preview window of Wikimedia Commons reconciliation service's data extension service produces error messages
Open, High · Public · BUG REPORT

Assigned To
Authored By
Spinster
Nov 23 2021, 1:52 PM
Referenced Files
F34770160: phab0.JPG
Nov 26 2021, 5:24 PM
F34770165: phab6.JPG
Nov 26 2021, 5:24 PM
F34770164: phab3.JPG
Nov 26 2021, 5:24 PM
F34770173: phab7.JPG
Nov 26 2021, 5:24 PM
F34770162: phab2.JPG
Nov 26 2021, 5:24 PM
F34770161: phab1.JPG
Nov 26 2021, 5:24 PM
F34770163: phab5.JPG
Nov 26 2021, 5:24 PM
F34763305: image.png
Nov 23 2021, 1:52 PM

Description

List of steps to reproduce (step by step, including full links if applicable):

  • Reconcile a list of Commons file names in OpenRefine
  • Start the process of creating additional data columns through the reconciliation service's data extension (Edit Column > Add columns from reconciled values...)

I worked with this dataset:

What happens?:

image.png (166×805 px, 24 KB)
image.png (179×804 px, 24 KB)

Requesting Wikitext and the Creator (P170) property values: in both cases, the preview window simply says 'Error.'

What should have happened instead?:

There should have been a visible preview of the wikitext for a selection of the first files - not an error message.

Interestingly, when ignoring the error message and clicking 'OK' anyway, data will be extended and the requested columns will be produced.

Software version (if not a Wikimedia wiki), browser information, screenshots, other information, etc:

OpenRefine 3.5.0 in Chrome 96.0.4664.55 on Mac OS 10.13.6

Event Timeline

Similar issue with a different dataset, this time using the Wikidata reconciliation service:

phab0.JPG (681×1 px, 85 KB)

phab1.JPG (814×1 px, 127 KB)

phab2.JPG (815×1 px, 115 KB)

phab5.JPG (824×1 px, 162 KB)

phab3.JPG (817×1 px, 145 KB)

phab6.JPG (661×1 px, 188 KB)

phab7.JPG (808×1 px, 190 KB)

Let's make progress on this one!
We could set up a downtime notifier to periodically send data extension queries to the service and check that the response is right.
This could be done via https://www.downnotifier.com/ for instance.
@Eugene233 could you set that up? We can then review the results in some weeks.
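As a rough illustration of what "check that the response is right" could mean beyond a plain uptime ping, here is a minimal Python sketch. It assumes the data extension response is a JSON object with a "rows" key that maps each requested id to its values, as in the reconciliation API's data extension protocol; the function names `missing_rows` and `check_service` are hypothetical and not part of the service.

```python
import json
import urllib.request


def missing_rows(payload, ids):
    """Return the requested ids that have no entry under "rows".

    Assumption: a healthy response looks like {"meta": [...], "rows": {id: {...}}}.
    Anything else (for example an error object) counts every id as missing.
    """
    rows = payload.get("rows", {}) if isinstance(payload, dict) else {}
    if not isinstance(rows, dict):
        rows = {}
    return [entity_id for entity_id in ids if entity_id not in rows]


def check_service(url, ids, timeout=30):
    """Fetch a data extension URL and report which ids came back without a row."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return missing_rows(payload, ids)
```

An empty list from `check_service` would mean every requested id was answered. A hosted notifier that only looks at the HTTP status code cannot make this distinction.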

Spinster lowered the priority of this task from High to Medium. (Jan 5 2022, 4:00 PM)
Spinster raised the priority of this task from Medium to High. (Jan 26 2022, 8:40 AM)

Let's indeed set up a downtime notifier during this sprint (Jan 24-Feb 11) so that we can track the reliability of the reconciliation service.

Is https://www.downnotifier.com a trusted/good service? I see that it has a premium plan for $14.95 per year. Should we research alternatives too? If it helps, I can pay for such a premium service out of our budget. Let me know if you think that would be helpful @Eugene233 @Pintoch

The free plan should be enough.

Downnotifier is set up (with my private email address, so that I can check it easily).

It has been given the following URL - is that a good one?

https://commonsreconcile.toolforge.org/en/api?extend=%7B%22ids%22%3A%5B%22M32709744%22%5D%2C%22properties%22%3A%5B%7B%22id%22%3A%22wikitext%22%7D%5D%7D
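For readers who want to see what that URL actually asks for, here is a small Python sketch that decodes its `extend` parameter using only the standard library. It requests the wikitext of a single file, M32709744.

```python
import json
from urllib.parse import parse_qs, urlsplit

url = (
    "https://commonsreconcile.toolforge.org/en/api?extend="
    "%7B%22ids%22%3A%5B%22M32709744%22%5D%2C%22properties%22"
    "%3A%5B%7B%22id%22%3A%22wikitext%22%7D%5D%7D"
)

# parse_qs percent-decodes the query string; the value is a JSON document.
extend = json.loads(parse_qs(urlsplit(url).query)["extend"][0])
print(extend)
# {'ids': ['M32709744'], 'properties': [{'id': 'wikitext'}]}
```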

Until now (since setting up the notifier, a few weeks ago) uptime has been 100%. However, I still occasionally get the 'error' message. It seems to appear inconsistently, but especially in the following cases:

  • larger datasets where it takes a while to compute the previews
  • and/or (often in combination with the previous) when selecting both Wikitext and one or two properties at the same time

Can we catch such occurrences via downnotifier.com by feeding it a more complex (demanding) URL for instance? If so, what should it look like?
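One possible way to produce a more demanding URL is to generate it from a list of ids and properties, as in the Python sketch below. The helper name `build_extend_url` is hypothetical, and the ids other than M32709744 are placeholders, not real test files. Whether a given URL actually reproduces the error is untested here.

```python
import json
from urllib.parse import urlencode

SERVICE_URL = "https://commonsreconcile.toolforge.org/en/api"


def build_extend_url(ids, property_ids, base_url=SERVICE_URL):
    """Build a data extension URL for the given ids and property ids."""
    query = {
        "ids": list(ids),
        "properties": [{"id": property_id} for property_id in property_ids],
    }
    # Compact separators keep the JSON free of spaces before percent-encoding.
    extend = json.dumps(query, separators=(",", ":"))
    return base_url + "?" + urlencode({"extend": extend})


# Several files, with Wikitext plus Creator (P170) requested together.
# "M1" and "M2" are placeholder ids for illustration only.
heavier_url = build_extend_url(["M32709744", "M1", "M2"], ["wikitext", "P170"])
```

Called with a single id and only `wikitext`, the helper reproduces the URL currently given to the notifier, so the two checks are directly comparable.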

After investigating, I think this could be a limit issue: the Wikimedia Commons API caps the number of values it accepts in the "ids" parameter of a single query. I am therefore trying to run the query in batches that respect the limit.

The error I get indicates this:

{'error': {'code': 'toomanyvalues', 'info': 'Too many values supplied for parameter "ids". The limit is 50.', 'limit': 50, 'lowlimit': 50, 'highlimit': 500, '*': 'See https://commons.wikimedia.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/> for notice of API deprecations and breaking changes.'}, 'servedby': 'mw1390'}
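The batching idea can be sketched in Python as follows. This is an illustration only, not the code from the patch: `batched`, `fetch_all` and the `fetch_batch` callback are hypothetical names, and the callback is assumed to return a dict keyed by id for each batch of at most 50 ids.

```python
from itertools import islice

API_LIMIT = 50  # the "limit" value reported in the toomanyvalues error


def batched(values, size=API_LIMIT):
    """Yield consecutive lists of at most `size` items from `values`."""
    iterator = iter(values)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def fetch_all(ids, fetch_batch, size=API_LIMIT):
    """Call `fetch_batch` once per batch and merge the per-id results."""
    results = {}
    for batch in batched(ids, size):
        results.update(fetch_batch(batch))
    return results
```

With 120 ids this makes three calls of 50, 50 and 20 ids, so no single request exceeds the limit.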

Change 769409 had a related patch set uploaded (by Eugene233; author: Eugene233):

[labs/tools/commons-recon-service@main] Preview window of Wikimedia Commons reconciliation service's data extension service produces error messages

https://gerrit.wikimedia.org/r/769409

Change 769409 merged by jenkins-bot:

[labs/tools/commons-recon-service@main] Preview window of Wikimedia Commons reconciliation service's data extension service produces error messages

https://gerrit.wikimedia.org/r/769409

@Spinster Can we close this task now that the change is merged?

An update from user testing sessions in June (see https://github.com/OpenRefine/CommonsExtension/issues/28 ): the observation above that "Interestingly, when ignoring the error message and clicking 'OK' anyway, data will be extended and the requested columns will be produced." no longer holds. If an error is generated, the data extension does not work, so further fixes are required for this bug.