Graph not displayed if linked to a wikidata query
Closed, Resolved (Public)

Description

Hello, I'd like to report that graphs stopped working a few days ago.
If a graph is linked to raw data, it displays correctly (as in this example: Original MediaWiki Template).

If a graph is linked to a Wikidata query (as in the other examples on the same page: Original MediaWiki Template), it is not displayed: the graph appears when you hit preview, but disappears once the wiki page is saved.

The problem can be reproduced on wikis in any language.

Event Timeline

vegaErr: Error: Load failed with response code 403. -- Vega attempts to call Wikidata API to get the needed data, and I suspect that API returns 403. I would look at the HTTP request Vega makes (it should be very similar to the query stored in the graph on the wiki page), and try to find it in WDQS logs. Perhaps WDQS now blocks some HTTP requests that do not appear to originate from the browser (i.e. have fewer headers than expected)?

Can we verify that Vega sets proper user agent (and by that I mean some string that identifies it) when sending queries to WDQS?

Hello @Yurik, is there any update on your side? Thanks :)

@Lea_Lacroix_WMDE not from my side - I'm a bit overbooked at the moment with my main job (elastic.co) and family. It will take me some effort to get the system running again on my laptop to see what Graphoid sends to the servers. It might be easier to track it from the server side logs if anyone has that access.

@Yurik Thanks for your answer. We can start by investigating the possibility of a user-agent header issue. I created a ticket: T229236.

Change 526442 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[mediawiki/services/graphoid@master] Improve user agent

https://gerrit.wikimedia.org/r/526442

I looked at webrequest for the four hours around the time of Petr's post, and I couldn't see any 403s to wikidata.org/w/api.php. If someone knows when the error shows up, it can be found in the webrequest table very easily:

select *
  from wmf.webrequest
 where uri_host = 'www.wikidata.org'
   and uri_path = '/w/api.php'
   and http_status = '403'
   and year=2019 and month=7 and day=24 and hour in (16, 17, 18, 19)
 limit 200;

@Milimetric I believe graphoid will not go via varnish, but directly to an app server, so no surprise you didn't find the request.

Ok, I went through kafka stream and this is what graphoid is calling in terms of MW API:

curl 'https://mediawiki.org/w/api.php?action=graph&format=json&hash=1d5fea924d176cd28d92d2c76a7084a5e83c9563&title=Template:Graph:Lines&formatversion=2'

which returns some JSON with the following URI:

wikidatasparql:///?query=SELECT%20%3Fdecade%20%28COUNT%28%3Fdecade%29%20AS%20%3Fcount%29%20WHERE%20%7B%0A%20%20%3Fitem%20wdt%3AP31%20wd%3AQ3305213%20.%0A%20%20%3Fitem%20wdt%3AP571%20%3Finception%20.%0A%20%20BIND%28%20year%28%3Finception%29%20as%20%3Fyear%20%29.%20%0A%20%20BIND%28%20ROUND%28%3Fyear%2F10%29%2A10%20as%20%3Fdecade%20%29%20.%0A%20%20FILTER%28%20%3Fyear%20%3E%201400%29%0A%7D%20GROUP%20BY%20%3Fdecade%20ORDER%20BY%20%3Fdecade
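To read the SPARQL carried in such a URI, the query string can be percent-decoded. The following is a minimal sketch using only the Python standard library; the helper name `sparql_from_graph_url` is hypothetical, and the example URI is just the first part of the one above, shortened for readability.

```python
from urllib.parse import urlsplit, parse_qs


def sparql_from_graph_url(url: str) -> str:
    """Return the decoded SPARQL text carried in a wikidatasparql:// URL."""
    parts = urlsplit(url)
    if parts.scheme != "wikidatasparql":
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    # parse_qs percent-decodes the value (%20 -> space, %3F -> ?, %0A -> newline).
    return parse_qs(parts.query)["query"][0]


# Beginning of the URI quoted in the thread, truncated here for readability.
example = (
    "wikidatasparql:///?query="
    "SELECT%20%3Fdecade%20%28COUNT%28%3Fdecade%29%20AS%20%3Fcount%29"
)
print(sparql_from_graph_url(example))
```

Running the same helper on the full URI shows the multi-line query, including the newlines encoded as %0A.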

@Pchelolo Graphoid first calls action=graph to get the data, but then it should also call WDQS directly using that query. Also, you can see what that request looks like if you go to the wiki page with a graph, click edit source, and do a page preview -- your browser should make a very similar request to WDQS, except that unlike Graphoid, the browser forces a few headers like user agent (IIRC).

TL;DR: the user agent is not set; it just shows up as "-", so that's what WDQS sees. From what @Smalyshev says above, this is what's causing the 403s, right?

Queries to find this: doh, my bad, of course these are on the query.wikidata.org domain, so:

 select *
   from wmf.webrequest
  where uri_host = 'query.wikidata.org'
    and http_status = '403'
    and uri_query like '?query=%decade%'
    and year=2019 and month=7 and day=24 and hour in (16, 17, 18, 19)
    and webrequest_source='text'
    limit 200
;

Which shows me 14 requests, I'm assuming the ones made above for testing. The user agent didn't get parsed, so I sudo -u analytics and look in wmf_raw.webrequest for just hour 18 which had the most hits:

ADD JAR /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar;

 select *
   from wmf_raw.webrequest
  where uri_host = 'query.wikidata.org'
    and http_status = '403'
    and uri_query like '?query=%decade%'
    and year=2019 and month=7 and day=24 and hour = 18
    and webrequest_source='text'
    limit 200
;

the user agent is not set, it just shows up as - so that's what WDQS sees

In this case you'd get 403, yes. This needs to be fixed.
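For illustration, here is a minimal sketch of a client that always identifies itself when querying WDQS. It is not Graphoid's code (Graphoid is a Node.js service); the helper name, the user-agent string and the use of the `/sparql` path on query.wikidata.org are assumptions made for this example. The request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint path on the host seen in the webrequest logs above.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"
# Hypothetical identifying string; a real client would name itself and a contact.
USER_AGENT = "ExampleGraphRenderer/0.1 (https://example.org/contact)"


def build_wdqs_request(sparql: str, user_agent: str = USER_AGENT) -> Request:
    """Build (but do not send) a WDQS request with an explicit User-Agent."""
    url = WDQS_ENDPOINT + "?" + urlencode({"query": sparql})
    headers = {
        "User-Agent": user_agent,
        "Accept": "application/sparql-results+json",
    }
    return Request(url, headers=headers)


req = build_wdqs_request("SELECT * WHERE { ?s ?p ?o } LIMIT 1")
# urllib normalises header names, so the key is stored as "User-agent".
print(req.get_header("User-agent"))
```

Passing the result to `urllib.request.urlopen` would send the query; a request built without the `User-Agent` entry corresponds to the "-" seen in the logs.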

Vega seems to allow you to control headers via the dataHeaders property, but it's not really documented, I found it here: https://github.com/vega/vega/blob/af5cc1df42eb5aaf2f478d0bda69313643fe0532/docs/releases/v1.5.4/vega.js#L378

If that's right, then I guess the code to update would be initVega for v1 and v2:

@Yurik does that sound right? I've no time to babysit a patch, but it seems easy to try/test.

Vega1 doesn't need it - it doesn't support external URLs. Vega2 approach sounds correct. Thx for digging into it!

Hi there, it looks like the problem has become general: even graphs with raw data no longer appear to work, as can be seen here: https://www.mediawiki.org/wiki/Template:Graph:Lines

This has come up at https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#Problem_with_Template:Graph:Population_history and seems to be related to the Wikidata query. The template at https://www.mediawiki.org/wiki/Template:Graph:Lines does a Wikidata lookup from the documentation page. However, once the lookup has run in preview, the page seems to remember the results.

This issue impacts hundreds of pages on many wikis. Should we remove these templates and mark them as obsolete or could we hope for a fix? Thanks.

@Ayack As far as I understand, Graphoid currently doesn't have a maintainer (see T211881). If no one is willing to take over, I think the option of marking it as obsolete and no longer using it should be considered.

I believe it's an important issue: an encyclopedia without graphs... well... plus it contributes to reducing interest in Wikidata.

It would be a pity to have no way of creating Wikidata based graphs for the other Projects.

For what it is worth, WMF, with some support from WMDE, is looking into options for graph rendering. Changes will take some time, so we need to ask you all for a bit more patience, but we're with you on that one: some solution is needed, and there is no intention to just leave things as they are now.

Please avoid adding "me too" / +1 comments which don't help to track down or solve the technical problem. Thanks a lot!

I just saw that it must have been fixed. Good job, whoever that might be! Thanks!

Unfortunately, it's not the case. Where have you seen a change?

In an article at the Greek wikipedia. Here you go:
https://el.wikipedia.org/wiki/%CE%86%CE%BD%CF%89_%CE%94%CE%BF%CE%BB%CE%B9%CE%B1%CE%BD%CE%AC_%CE%91%CF%81%CE%BA%CE%B1%CE%B4%CE%AF%CE%B1%CF%82#%CE%94%CE%B7%CE%BC%CE%BF%CE%B3%CF%81%CE%B1%CF%86%CE%B9%CE%BA%CE%AE_%CE%B5%CE%BE%CE%AD%CE%BB%CE%B9%CE%BE%CE%B7

It looks like the graph was built with Lua code, using the module here:
https://el.wikipedia.org/wiki/Module:Graph:Population_history

OK, I saw it. Couldn't the same fix be used for the English one, etc.?

Everybody please strip unneeded full quotes when replying via email to Phabricator tasks, to keep things readable. Thanks.

Change 526442 merged by Daniel Kinzler:
[mediawiki/services/graphoid@master] Improve user agent

https://gerrit.wikimedia.org/r/526442

So... this is all caused by the UA string not being set? Or not conforming to some pattern? I just merged @Ladsgroup's patch, should this fix the issue?

https://gerrit.wikimedia.org/r/c/mediawiki/services/graphoid/+/526442/1/routes/graphoid-v1.js

It needs to be deployed first, and I'm not sure whether that's possible, or whether it's the only issue here :(((

Perhaps it's the same as T214984? These examples on mediawiki.org currently include newlines in SPARQL, which apparently isn't allowed in proper JSON.
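The JSON point can be checked in isolation. The sketch below only illustrates the general rule that a literal newline inside a JSON string is rejected by a strict parser, while the escaped form is accepted; it does not reproduce how the Graph extension or Graphoid parse a spec, and the `query` key and the helper `is_valid_json` are made up for the example.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON with Python's strict parser."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# A multi-line SPARQL snippet similar to the one used in the examples.
sparql = "SELECT ?item WHERE {\n  ?item wdt:P31 wd:Q3305213 .\n}"

# Pasting the query into a JSON string by hand leaves raw newlines inside it.
hand_written = '{"query": "' + sparql + '"}'

# Letting a JSON encoder build the document escapes each newline as \n.
escaped = json.dumps({"query": sparql})

print(is_valid_json(hand_written))
print(is_valid_json(escaped))
```

Decoding the escaped document returns the original multi-line query unchanged, so escaping the newlines loses nothing.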

Hello, is someone working on this? Thanks.

+1

Ping @Lea_Lacroix_WMDE @Lydia_Pintscher any news?

Hello,
Unfortunately, there is not much Lydia or me can do about it, as the initial and now unmaintained tool was not under the responsibility of WMDE's development team. I'm regularly poking people around, trying to figure out who's responsible and who can help. I'll try once again to get someone to provide a more satisfying answer :)

Sorry about the ping then.

From what I understand, the problem doesn't seem to be with the extension itself, which works fine (see https://www.mediawiki.org/wiki/Extension:Graph#Charts_examples for one of numerous examples), except when using Wikidata data via the query service (when using Wikidata data directly, the extension also works fine; see https://fr.wikipedia.org/wiki/Pont-l%27%C3%A9v%C3%AAque#Production for an example).
Am I wrong somewhere in my assumptions?

Well, yes and no. The extension depends on a service called Graphoid, which is responsible for calling the Wikidata query service and building the data used by the extension. The service has been unmaintained for years (T211881). It will be undeployed from production and replaced with logic inside the extension, IIRC, but that is (1) outside the control of the Wikidata team and (2) going to take some time.

Hi there!

Just a quick note from me today, simply to say that we recently concluded a (very slow) search for a contractor to take on maintenance of the Graph extension, including undeployment of Graphoid. That work should start Soon™, though we likely will have exceeded our target of undeploying the service by mid-February. I would expect that maintenance work will continue after the service is sunsetted, including wrapping up any loose bugs.

Feel free to reach out if you have further questions, otherwise I expect there will be someone along shortly (read: within the next month or so) to triage and prioritize these things.

(hope that helps @Lea_Lacroix_WMDE)

@MarkTraceur Thanks for the update!
What would you advise the projects that currently use the tool to do? Should they already start removing or replacing their currently broken graphs, or wait for more information?

Theoretically, if I understand the plan correctly, all rendering will happen client side. So any errors with graphs should be obvious going forward, and I'd advise projects to keep things as they are and try the new approach. I still have doubts about the performance of this approach, as the page won't finish rendering until potentially a lot of data is downloaded, and instead of downloading an image of a few KB, we might have potentially millions of devices downloading MB of data. Hopefully these things are considered carefully before sunsetting the service.

Hi @MarkTraceur, is there any fresh update to share about the maintenance of the Graph extension? :)

@Lea_Lacroix_WMDE work is ongoing to ensure that the frontend-only version of the extension will work as expected (read: minimal regressions). Our rough timeline seems to be 3-4 weeks, depending on staff capacity. Once that work is done (and Graphoid is de-deployed) we may continue by triaging and focusing on some of these bugs.

@MarkTraceur Awesome, thanks for the quick answer.
Just to make sure I understand correctly: does that mean that wikis who didn't touch the current code will have their graphs working again after these 3-4 weeks? Or do they have to change something in the code?

It seems that the undeployment of Graphoid will potentially fix this issue.

We are currently looking at whether there are any potential issues with client-side queries being run en masse.

Will update asap.

Hello, it looks like Wikidata graphs are working again at https://www.mediawiki.org/wiki/Template:Graph:Lines. Is this the final result, or do we have to wait for the undeployment of Graphoid on the different language wikis? Thanks

Looking to deploy on further wikis this week. We are also creating the ability for communities to provide a static fallback image for users who don't use scripts, at least until a long term solution can be found for static image generation.

Any news about that?

Hi, is there a date for when Graphoid will be undeployed on frwiki? Thanks

@Bouzinac: Please see the corresponding task instead: T242855: Undeploy graphoid; thanks.

I don't know if there has been a specific fix, but the images now render as they did when it was working. For instance, here: https://fr.wikipedia.org/wiki/Liste_des_a%C3%A9roports_les_plus_fr%C3%A9quent%C3%A9s_en_Oc%C3%A9anie#En_graphique

If nobody can reproduce anymore this task should probably be set to resolved or declined as per https://www.mediawiki.org/wiki/Bug_management/Bug_report_life_cycle - @Jseddon?

I can confirm that it seems to be fixed.

Hello, unfortunately at least enwiki is still problematic

@Bouzinac, could you point to a page on enwp that still suffers this issue?

On the template documentation page https://en.wikipedia.org/wiki/Template:Graph:Lines/doc the simple graph pulling data from wikidata works just fine, but the more complicated query times out.

Hello, as far as I know, there is no longer any serious problem with Graph:Lines.
I have corrected the SPARQL of the Wikidata population example in the documentation of every wiki that uses Graph:Lines.
Thanks for the work!

I've resolved this task since they are now working. Let's create a separate task for performance with the WDQS.