Defective static thumbnail, generated in page preview using an empty GeoJSON layer, is cached for long time after saving
Closed, Resolved (Public)

Description

The static map thumbnail images, shown when viewing a page with a mapframe map, seem to be taking a long time to be generated/shown. At my sandbox (at enwiki), I made this edit at 23:03 UTC.
This added a map with two point features, via templates/modules. The equivalent wikitext for the mapframe map, after expanding templates, is

<mapframe height="200" frameless="1" align="center" width="290">[
{"type":"Feature","geometry":{"coordinates":[-83.00208694,39.962375],"type":"Point"},"properties":{"description":"39.96237500°N 83.00208694°W","title":"[[LeVeque Tower|American Insurance Union Citadel]]","marker-color":"5E74F3"}},
{"type":"Feature","geometry":{"coordinates":[-82.997111,39.942111],"type":"Point"},"properties":{"description":"39.942111°N 82.997111°W","title":"[[Krumm House|Krumm House]]","marker-color":"5E74F3"}}
]</mapframe>

Half an hour later, after multiple purges, null edits, and a dummy edit, the page looks like this when viewed (left) and previewed in the wikitext editor (right):

Mapframe thumbnail (view and preview).png (741×1 px, 215 KB)

The preview of course uses a dynamic map rather than a static image, so it shows the two point features as expected. When viewing the article, the expected result is the same or a very similar view as a static image, but instead the zoomed-out world map is displayed.
I'm guessing that generating the static thumbnail image is taking a very long time, and that a default worldwide view is being shown instead.
For some reason, maybe bad caching or a race condition, an empty GeoJSON layer is initially being used instead of the actual GeoJSON when generating the static thumbnail; if no coordinates are explicitly set, this results in the default map being rendered.

The same thing happens at https://en.wikipedia.org/wiki/User:Evad37/Sandbox_10, which is using the mapframe tags directly rather than via templates/modules.

A very similar mapframe map, but with about 168 other point features, is currently displaying correctly for me at National Register of Historic Places listings in Columbus, Ohio, but a few hours ago another user reported that the map was broken (at https://en.wikipedia.org/wiki/Template_talk:NRHP_row#Wikidata_coordinates).

Event Timeline

Having checked again just now, 1:27 UTC, the thumbnail is now displayed correctly on both of my sandbox pages linked above.

Actually, the description of how the static image is generated is not accurate: the "default map" rendering happens when an empty GeoJSON is used to render the map and no coordinates were explicitly set for the mapframe.

The GeoJSON used to create another layer in the static map comes from either the geoshapes service or the mapdata lib requesting WDQS/MW. On the Kartographer side, the data is collected from the ParserOutput.

This could be related to T158657: Kartotherian error: GroupId not available, which IMO is being caused by bad caching or some sort of race condition in this issue.

Two weeks ago I described similar examples in T268927#6655678, as the issue in that task seemed at least partly the same. Earlier, on November 17 I added maps with geoline/geoshape to nearly 100 Wikipedia articles and then all snapshots (static images) were generated/served correctly right after saving, which also was the case before that date, as far as I can remember. So I'd say this issue has surfaced recently.

This comment was removed by Pikne.

Something interesting: If the size of the mapframe is adjusted, e.g. <mapframe height="200" to <mapframe height="201", then the static map is immediately generated correctly - and undoing causes the worldwide static map to appear again. It would appear the snapshot service is initially fed bad data (perhaps checking for group ids before they have been generated/saved?), and then that bad image is cached for a couple of hours.

@MSantos: Is there any way for the snapshot service, if it receives empty GeoJSON, to either

  • try making the request again, perhaps after a short delay, to give whatever is generating/saving the missing data a chance to complete its work; and/or
  • mark the bad image as not-to-be-cached, or cached for a much shorter time span, so that the next load generates a correct image?
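
The first suggestion above (retry after a short delay) could be sketched roughly as follows. This is a hypothetical Python illustration only, not code from the snapshot service: the `fetch_geojson` callable, the attempt count, and the delay are all assumptions made for the example.

```python
import time
from typing import Callable, Optional


def fetch_with_retry(
    fetch_geojson: Callable[[], list],   # hypothetical: returns a list of GeoJSON features
    attempts: int = 3,                   # assumed retry budget
    delay_s: float = 0.5,                # assumed pause between attempts
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[list]:
    """Call fetch_geojson until it returns non-empty data or attempts run out.

    Returns the feature list, or None if every attempt came back empty.
    """
    for attempt in range(attempts):
        features = fetch_geojson()
        if features:
            return features
        # Only wait if another attempt is still to come.
        if attempt < attempts - 1:
            sleep(delay_s)
    return None
```

A caller that gets `None` back would still have to decide what to render; the sketch covers only the retry, not the caching of the result.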

... now I added mapframe with geoline to 134 articles, and interestingly correct snapshot image...

I forgot to mention that I used a bot account (pywikibot). Now, after having edited/added a bunch of other maps, my impression is that a correct snapshot is generally available right away after saving if I edit via a bot account, and otherwise not. Does this imply anything useful?

Edit: This is now explained below. If I edit via bot script, then I don't use preview, and so defective static image also doesn't get cached during preview.

On ca.wiki it is reported that static maps are not rendered if you previewed the dynamic map before saving. It works fine if you save without a preview. To reproduce it:

  • Check that you don't have preview enabled by default in your preferences.
  • Copy and paste the mapframe JSON from the description into a sandbox.
  • Click the preview button. It shows the correct map.
  • Save it. It shows a blank map.
  • Edit it, e.g. change the height. Save without preview. It renders the correct static map.

Another weird thing that I have noticed is that if you change the zoom level of your browser, the details appear on or disappear from the map.

For example, I can see the details of the map in this article if I set the zoom level at 100%, but if I change it to, say, 130% the details disappear.

Is it really necessary to use a static thumbnail? Other Wikimedia projects, like Commons or Wikiviajes, do not use them. They load the interactive map directly.

Good idea.

Pikne renamed this task from "Mapframe static thumbnail taking a long time to generate" to "Defective static thumbnail, generated in page preview using an empty GeoJSON layer, is cached for long time after saving". Oct 27 2021, 8:16 AM
Pikne added a subscriber: Teslaton.

Isn't it possible to suppress requesting the static thumbnail URL in preview mode (the static thumbnail isn't used there anyway, AFAIK; a live map iframe is displayed instead), so that the first request is made later, from an already saved page (when the context data for proper rendering is definitely available)?

Or to add some URL param so that the thumbnail cache key differs depending on whether the request comes from a preview or a non-preview page.
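
The second idea could look something like the sketch below. It is only an illustration in Python, not Kartographer code: the URL shape is modelled on static map URLs of the kind that appear in this task, while the function name and the `preview` query parameter are hypothetical and not an existing option of the snapshot service.

```python
from urllib.parse import urlencode


def static_map_url(zoom, lat, lon, width, height, title, groups, preview=False):
    """Build a static map URL; a preview request gets a distinct cache key."""
    path = (
        "https://maps.wikimedia.org/img/"
        f"osm-intl,{zoom},{lat},{lon},{width}x{height}.png"
    )
    params = {
        "lang": "en",
        "domain": "en.wikipedia.org",
        "title": title,
        "groups": groups,
    }
    if preview:
        # Hypothetical parameter: its only job is to make the URL (and
        # therefore the cache key) differ between preview and saved page.
        params["preview"] = "1"
    return f"{path}?{urlencode(params)}"


saved = static_map_url(6, "53.383333", "-1.466667", 300, 400, "Downton Abbey", "_12345")
previewed = static_map_url(6, "53.383333", "-1.466667", 300, 400, "Downton Abbey", "_12345", preview=True)
```

Because the two URLs differ, a degraded image cached for `previewed` could not be served for `saved`.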

So, I'm thinking through this particular case a bit. The static PNG is broken because the 'group' isn't available yet (which is because we first rendered it as a preview).

The static image link is included in the preview, WITH group param, so when the page is rendered, the static map renderer is called because the img attribute is in the html. The static map renderer fails to retrieve the group details and thus cannot add the layer. The image is still generated and cached under the URL with the group. After save the broken image is returned, even though the groupid information can now be retrieved....

Observations made:

  • The highDPI version and the normal version sometimes differ. That would not be unexpected given this problem; it depends on the browser and its sizing at the moment it decided which image in the srcset to load first.
  • Changing the size or JSON of the map fixes it. Logical, because it varies the URL of the image.
  • If the above is true, this problem should only occur on Wikipedia and mediawiki.org, because sister wikis do not have StaticMapFrame set....

So if we remove the staticmap img from the preview ParserOutput... then that would solve the problem, right?
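
The sequence described above can be reproduced with a toy model. This is a Python sketch under stated assumptions, not the real renderer: `SnapshotService`, its fields, the URL, and the group id `_abc` are all invented for illustration, and the "cache" is a plain dict keyed by URL.

```python
class SnapshotService:
    """Toy stand-in for the static map renderer plus its URL-keyed cache."""

    def __init__(self):
        self.saved_groups = {}  # group id -> features; filled only on page save
        self.cache = {}         # URL -> description of the rendered image

    def get(self, url, group_id):
        if url not in self.cache:
            features = self.saved_groups.get(group_id)
            # A missing group does not fail the request: the base map is
            # rendered without the overlay and cached like any other image.
            self.cache[url] = (
                f"{len(features)} markers" if features else "base map only"
            )
        return self.cache[url]


svc = SnapshotService()
url = "/img/osm-intl,6,39.96,-83.0,290x200.png?groups=_abc"  # made-up URL

preview_result = svc.get(url, "_abc")     # preview: group not saved yet
svc.saved_groups["_abc"] = ["p1", "p2"]   # the page is saved
after_save = svc.get(url, "_abc")         # same URL, so the cached image is reused
resized = svc.get(url.replace("x200", "x201"), "_abc")  # new URL, fresh render
```

Here `after_save` is still the base map even though the group now exists, while `resized` gets the markers, which matches the observation that changing the height fixes the image.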

Change 735066 had a related patch set uploaded (by TheDJ; author: TheDJ):

[mediawiki/extensions/Kartographer@master] Don't generate img urls with dynamic groups for previews

https://gerrit.wikimedia.org/r/735066

There may be another level to look at: varnish shouldn't cache error responses, for exactly this sort of race condition. Typically an HTTP 5xx status will trigger no-cache, but I think we're getting hit by the server's liberality: it continues to respond with a 200 even for degraded maps. Ideally, we would return status 200, with caching disabled for both public and private caches. I'll create a subtask to look at this further.

Example map snapshot with a bad group: https://maps.wikimedia.org/img/osm-intl,6,53.383333,-1.466667,300x400.png?lang=en&domain=en.wikipedia.org&title=Downton+Abbey&groups=_12345

Work done here already seems correct: even if varnish stops caching, we still want to respond to a bad group by rendering the base map without overlays.
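
As a rough illustration of the caching idea in this comment, the response headers could depend on whether the overlay data was actually resolved. This is a Python sketch with assumed names and values (`cache_headers`, the one-day and zero defaults), not the Kartotherian implementation; only the `Cache-Control` directives themselves are standard HTTP.

```python
def cache_headers(
    overlays_resolved: bool,
    normal_max_age: int = 86400,   # assumed TTL for good images, in seconds
    degraded_max_age: int = 0,     # assumed TTL for degraded images; 0 = don't cache
) -> dict:
    """Pick Cache-Control for a 200 response that may be a degraded map."""
    if overlays_resolved:
        return {"Cache-Control": f"public, max-age={normal_max_age}"}
    if degraded_max_age > 0:
        # Middle ground: cache the degraded image, but only briefly.
        return {"Cache-Control": f"public, max-age={degraded_max_age}"}
    # no-store applies to both shared (public) and private caches.
    return {"Cache-Control": "no-store"}
```

Setting `degraded_max_age` to a small positive value instead of 0 would keep some caching of degraded responses while still letting a correct image replace them soon after.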

Not caching negative responses may extend surface for DDoS attacks.

Change 735066 merged by jenkins-bot:

[mediawiki/extensions/Kartographer@master] Don't generate img urls with dynamic groups for previews

https://gerrit.wikimedia.org/r/735066

While summing up T293841 I realized this change here will create a spike on the Kartotherian server (the server that renders the static .png map images). The reason is that the patch https://gerrit.wikimedia.org/r/735066 introduced a new URL scheme that didn't exist before and was never cached anywhere. Kartotherian will be hit every time a page is previewed for the first time after deployment, no matter if a map on the page was edited or not. This will spike for a short time at the rate previews are made on the source wiki, and hopefully flatten fast.

This was different before. Before, a new map image URL was only created when a map was edited, and that is rare. Before, the chance a preview would hit Kartotherian was close to 0.

Merging the patch was still the right thing to do.

The good parts:

  • Edits are rare compared to page views.
  • Pages with maps are still rare.
  • The mapdata API won't be affected. The new URL scheme doesn't even contain anything that would allow requesting mapdata from the source wiki.
  • Rendering these new maps should be fast because they never contain any geometry, markers or such.

Still I find it hard to estimate the actual impact. Is there a monitor for the load on Kartotherian?

@thiemowmde you probably want to check out the grafana dashboard, especially the "Static Snapshot requests" graph at https://grafana.wikimedia.org/d/000000305/maps-performances

The alternative could be to completely remove the img element from the preview, of course, but I don't know off the top of my head if the JS module will handle that….

I think this task can be closed. TheDJ's patch fixed the main issue outlined in the task description. If necessary, separate tasks can be created for cases where previews have always been broken or, as a trade-off, are a bit more broken now. There's also T203863.

Pikne assigned this task to TheDJ.