Geoshapes service unavailable since March 1st
Closed, Resolved · Public · Bug Report

Description

On this page, where the map references a geoline object, the following requests return HTTP 503:

The same seems to apply to all maps with geoshape/geoline objects. This has been happening since yesterday. I see that patches in T301664 were deployed yesterday. Is this related?

Event Timeline

WMSE are working on a large biodiversity project that depends on these maps. Could this please be fixed as soon as possible?

For me most maps are completely blank, sometimes they load after a few mins. A few of the maps have a base layer but any shapes highlighted do not appear. No maps load correctly. This is true for both English Wikipedia and Wikidata.

Just FYI @Jgiannelos and @TheDJ since you worked on the other task

We have now reverted to the previous working state, so the geoshapes service should be working again.

Jgiannelos claimed this task.

@Jgarrard I checked it earlier and it was working, but now it's not working again :(

This map doesn't appear at all https://en.wikipedia.org/wiki/User:John_Cummings/Experiments/Species_distribution_map

And this map doesn't show the geoshapes it's supposed to (you can briefly see them flash if you refresh the page): https://www.wikidata.org/wiki/User:Alicia_Fagerving_(WMSE)/sandboxMaps

Thanks @awight. Weirdly, none of it works for me; it all fails. The map doesn't load on my user page, and the other pages just say either "Bad geojson - unknown type ExternalData" or "Cannot GET /geoshape".

I'm reopening this task since reverting to the previous state has not resolved the issue

The current situation isn't quite the same as the one I outlined in the task description. At least some geoshapes can be requested now, e.g. this currently works for me: https://maps.wikimedia.org/geoline?getgeojson=1&ids=Q19686. But yes, at the moment the service works quite unreliably. The GeoJSON request in the task description worked for me around the time of yesterday's reversion, but now it returns HTTP 404.
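For anyone who wants to re-check several objects quickly, here is a minimal sketch that only builds request URLs of the same form as the working example above. The function name `geoshape_url` is made up for illustration, and joining several IDs with commas is an assumption; only the single-ID form appears in this task.

```python
from urllib.parse import urlencode

# Host taken from the example URL in this task.
BASE_URL = "https://maps.wikimedia.org"


def geoshape_url(kind, ids):
    """Build a GeoJSON request URL for the given Wikidata item IDs.

    `kind` is "geoshape" or "geoline", the two endpoints named in this task.
    Comma-joining multiple IDs is an assumption, not confirmed here.
    """
    if kind not in ("geoshape", "geoline"):
        raise ValueError(f"unexpected endpoint: {kind!r}")
    # safe="," keeps the comma separator readable instead of encoding it as %2C.
    query = urlencode({"getgeojson": 1, "ids": ",".join(ids)}, safe=",")
    return f"{BASE_URL}/{kind}?{query}"


print(geoshape_url("geoline", ["Q19686"]))
```

The resulting URL can then be opened in a browser or passed to any HTTP client to see whether it returns GeoJSON, HTTP 404, or HTTP 503.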

thiemowmde changed the subtype of this task from "Task" to "Bug Report".

My understanding is that geoshapes can be taken from both Commons and OpenStreetMap. For me, neither works:

https://en.wikipedia.org/wiki/User:John_Cummings/Experiments/Species_distribution_map this map uses geoshapes from OSM

I think we can close this task. Some of the examples above didn't work for me either at some point after the March 2nd rollback, but it seems this resolved itself, and in the past few days there haven't been problems with the given examples. If a certain object is persistently unavailable (see T218097), or if something like T268927 reoccurs, then a new ticket should probably be opened.

@Pikne @Jgiannelos this is still completely broken for me. Sometimes the maps are visible, sometimes not. All maps are broken when you click on the fullscreen button.

Does anyone know why this still doesn't work?

For context I'm working on species distribution maps for English Wikipedia so I want to use these maps on 100,000s of articles and have the data collected already.

Thanks

Reopening since it's still broken (I've checked with several people and it's broken for them as well).

It does seem a bit flaky. When it does work, it's sometimes very slow. For instance, John's species distribution map took 27 seconds to return an image. This seems much slower than before and might, of course, give some people the impression that it is not working at all.

Once requested, it seems faster for a certain amount of time, which probably indicates that the result is now cached.

@TheDJ do you have any suggestions on what might be broken?

This is a more complicated map which takes even longer to load (it's a distribution map for a family of plants): https://en.wikipedia.org/wiki/User:John_Cummings/Experiments/Species_distribution_map_multiple

I've also just realised that the maps are now a static image, they used to be interactive with zoom and ability to click on things. Weirdly this interactivity is broken on English Wikipedia but working on outreach wiki....

https://outreach.wikimedia.org/wiki/GLAM/Newsletter/February_2022/Contents/Content_Partnerships_Hub_report

No, you are misremembering that. Dynamic maps are only enabled on a few particular wikis (Wikidata, Wikivoyage, etc.). Most (high-volume) wikis use statically pre-rendered images, which only become interactive when you click to make them fullscreen.

Ah, ok, thanks for the correction :)

@John_Cummings I do note that some of your maps are highly detailed. For example, the one on outreach has 300 KB of shapes. That is a lot!

An effort should always be made to keep shapes as simple as possible. Don't just take the contours of a country from OSM and export them as they are; they need as much simplification of the detail level as possible (because in shapes we don't really have detail levels, unlike in OSM). As an example, I doubt that the distribution of a species tracks the exact borders of the national parks in Palestine; that seems rather unbelievable. Yet these parks have a very high level of border detail, which results in very large shapes for such a small area of distribution.

@TheDJ to the best of my knowledge, they aren't shape files copied from OSM into Commons. They're coming straight from OSM, so as far as I know there isn't a way of simplifying the shapes. If the shapes are too big, is there a way to simplify them automatically, like the thumbnails generated for JPEGs?

Also, the distribution data comes from Kew's Plants of the World Online. That's how they present the data (as whole countries or regions), and it's what WikiProject Plants have decided is the best source.

Also, just to say that there are over 1 million shapes on OSM with Wikidata links that can be used in these maps, so copying shapes over to Wikimedia Commons and simplifying them does not seem realistic or desirable. Is there some way to simplify the OSM shapes when displaying them?

There is some simplification that happens automatically whenever OSM shapes are used in our maps. I believe the default is to use this query, which I don't understand well enough to say anything about. The ST_Simplify function is given a "tolerance" parameter which controls the level of simplification, and this is parameterized as $3 in the query. Following that back to the mapframe itself seems like a dead end. I think the author intended this to be customizable, but it doesn't seem possible to take advantage of it at the moment.

We could talk about reviving this feature, or alternatively tune the defaults so that shapes are always simpler.
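To illustrate what a simplification tolerance does, here is a rough standalone sketch of the Douglas-Peucker algorithm, which is the algorithm PostGIS documents for ST_Simplify. This is not the Kartotherian query and not PostGIS code: the function names are invented for illustration, points are plain (x, y) tuples in arbitrary planar units, and the input is treated as an open line rather than a closed polygon ring.

```python
import math


def _distance_to_segment(p, a, b):
    """Distance from point p to the line segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Drop vertices that deviate from the overall line by at most `tolerance`."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior vertex farthest from the first-last chord.
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _distance_to_segment(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        # Everything in between is within tolerance: keep only the endpoints.
        return [first, last]
    # Otherwise keep the farthest vertex and simplify both halves recursively.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


border = [(0, 0), (1, 0.05), (2, 0), (3, 1), (4, 0)]
print(len(simplify(border, 0.01)), len(simplify(border, 0.1)))
```

A larger tolerance removes more vertices, so the returned GeoJSON gets smaller at the cost of border detail. That trade-off is what tuning the default would change.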

See the proposed investigation T303584.

... "Bad geojson - unknown type ExternalData" or "Cannot GET /geoshape"

Do you still get either of these? Is it just slow, or do you still not retrieve the data? For example, if the largest GeoJSON request in your last example (link) doesn't work, then what exact result or error do you get? I've been checking the examples from above over the past week, and for me they work.

Note that old static image examples, such as this, don't work or lack overlay objects. This is expected, because the example wiki page has changed and the static image now corresponds to an old page revision.

I'm trying to figure out what exactly we are asking for in this task, apart from the issue that was resolved with the March 2nd rollback. In its current form this task is probably no longer actionable. It still seems to me that other reproducible issues should be outlined clearly in new ticket(s).

If different users really get different results for the same geoshape request, then maybe someone can check whether it's something like T268927 again? Or what else may cause the divergence? Browser cache?

As for slowness, I think this is an old issue. I remember that requests for small/individual objects used to be quicker, but at some point about 1–2 years ago they got considerably slower. Requests for many objects (including those with Wikidata queries) have always been slow, as far as I can remember. Apart from shape simplification or a possibly flawed service setup, consider that users aren't limited from requesting a large amount of (simplified) data (see T149527). So maybe a more general performance review is needed?

Hi @Pikne

Fixed

  • Some of the maps display eventually, there are no more error messages.

Not fixed

  • Some maps don't load at all, e.g. https://en.wikipedia.org/wiki/User:John_Cummings/Experiments/Species_distribution_map_multiple
  • The maps are extremely slow to load (which wasn't happening before March). The maps used to load within a second or two; now they take 30 seconds to load the small map, and the same again to load the big map if you click on fullscreen. I know they were quicker before this happened because I was playing around with the software a lot in the past month. I would suggest that maps taking 30 seconds to load is still quite broken.

Thanks

Change 769976 had a related patch set uploaded (by Jgiannelos; author: Jgiannelos):

[mediawiki/services/kartotherian@master] Improve geoshapes PostGIS query performance

https://gerrit.wikimedia.org/r/769976

Change 769976 merged by jenkins-bot:

[mediawiki/services/kartotherian@master] Improve geoshapes PostGIS query performance

https://gerrit.wikimedia.org/r/769976

These are expensive queries, though. It's important to note that geoshapes and snapshots are rendered on the fly and then stored in the Varnish cache layer.

So, if you use it consistently, the map will load very fast except for the first time.

We can do better, though. Aside from Yiannis's patch, which will bring a lot of improvements, we can take advantage of imposm generalised tables and simplify the geometries once, in the import phase.

Note: these comments and suggestions are not related to the service instability, which was caused by tweaks in the PG config.
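As a toy model of the "slow the first time, fast afterwards" behaviour described above, here is a small in-memory cache placed in front of a render function. It is only a sketch and not how Varnish is configured: the class name `CachedRenderer`, the expiry time, and keying on the URL alone are all assumptions made for illustration.

```python
import time


class CachedRenderer:
    """Serve repeated requests from memory until the entry expires."""

    def __init__(self, render, ttl_seconds, clock=time.monotonic):
        self._render = render      # expensive function: url -> response body
        self._ttl = ttl_seconds    # hypothetical expiry, not the real setting
        self._clock = clock        # injectable so expiry can be simulated
        self._store = {}           # url -> (time stored, body)

    def get(self, url):
        """Return (body, was_cache_hit) for the given URL."""
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1], True
        # Cache miss or expired entry: render again and store the result.
        body = self._render(url)
        self._store[url] = (now, body)
        return body, False


def slow_render(url):
    time.sleep(0.2)  # stand-in for an expensive geoshape query
    return f"rendered {url}"


cache = CachedRenderer(slow_render, ttl_seconds=60)
for attempt in range(2):
    start = time.monotonic()
    _, hit = cache.get("/geoshape?getgeojson=1&ids=Q1")
    print(attempt, "hit" if hit else "miss", round(time.monotonic() - start, 2))
```

In this model only the first request after a miss or expiry pays the rendering cost, which would match the observation that a map is quick for a while after it has been requested once.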

@MSantos should another Phab ticket be created for these improvements? It looks like the fix sped up the generation considerably, which is really great, but the more complicated maps still take quite a while to load, e.g. https://en.wikipedia.org/wiki/User:John_Cummings/Experiments/Species_distribution_map_multiple

Does this mean that the map would still be generated once for each user, or once overall unless the page changes (like generating a thumbnail)?