
Parsoid cache invalidation for mobile-sections seems not reliable
Open, Needs Triage (Public)


Here is an example:

File:Andros Island, Bahamas.jpg was removed from the article Andros (Bahamas) on the English Wikivoyage on 12 June 2019.

Today, 30 June 2019, if I ask Parsoid:

$ curl -I ''
HTTP/2 200 
date: Sun, 30 Jun 2019 11:15:21 GMT
content-type: application/json; charset=utf-8; profile=""
content-language: en
cache-control: s-maxage=1209600, max-age=0, must-revalidate
access-control-allow-origin: *
access-control-allow-methods: GET,HEAD
access-control-allow-headers: accept, content-type, content-length, cache-control, accept-language, api-user-agent, if-match, if-modified-since, if-none-match, dnt, accept-encoding
access-control-expose-headers: etag
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
referrer-policy: origin-when-cross-origin
x-xss-protection: 1; mode=block
content-security-policy: default-src 'none'; frame-ancestors 'none'
x-content-security-policy: default-src 'none'; frame-ancestors 'none'
x-webkit-csp: default-src 'none'; frame-ancestors 'none'
x-request-id: 06334f50-9a83-11e9-9747-c9e95e895697
server: restbase1024
vary: Accept-Encoding,X-Seven
etag: W/"3769898/5a9fcc70-6c7b-11e9-942f-f658c0e8f0a2"
x-varnish: 204806208, 682062014 101908520, 38372033 11769974
via: 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1)
age: 71006
x-cache: cp1079 pass, cp3030 hit/2, cp3040 hit/5
x-cache-status: hit-front
server-timing: cache;desc="hit-front"
strict-transport-security: max-age=106384710; includeSubDomains; preload
set-cookie: WMF-Last-Access=30-Jun-2019;Path=/;HttpOnly;secure;Expires=Thu, 01 Aug 2019 00:00:00 GMT
set-cookie: WMF-Last-Access-Global=30-Jun-2019;Path=/;;HttpOnly;secure;Expires=Thu, 01 Aug 2019 00:00:00 GMT
x-analytics: https=1;nocookies=1
x-client-ip: ..............
set-cookie: GeoIP=............; Path=/; secure;
accept-ranges: bytes

$ curl '' | grep 'Andros_Island,_Bahamas.jpg' | wc
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 59694  100 59694    0     0   459k      0 --:--:-- --:--:-- --:--:--  455k
      1    1170   59695

It delivers a 20-day-old revision instead of the latest one.
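
The headers above can be decoded a bit further to see when the cached copy and the render were produced. This is a minimal sketch, not a statement of how RESTBase works: it assumes the ETag has the layout `revision/tid`, where `tid` is a version-1 (time-based) UUID stamped at render time, and it assumes the `age` header is accurate. The function names are made up for illustration.

```python
import uuid
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Version-1 UUID timestamps count 100 ns ticks since the start of the
# Gregorian calendar (RFC 4122).
UUID_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def parse_etag(etag):
    """Split W/"<revision>/<tid>" into (revision, tid).

    The revision/tid layout is an assumption about the ETag format.
    """
    value = etag.strip()
    if value.startswith("W/"):
        value = value[2:]
    revision, tid = value.strip('"').split("/", 1)
    return int(revision), uuid.UUID(tid)


def timeuuid_to_datetime(tid):
    """Return the UTC timestamp embedded in a version-1 UUID."""
    if tid.version != 1:
        raise ValueError("not a time-based (version 1) UUID")
    # tid.time is in 100 ns ticks; integer-divide by 10 to get microseconds.
    return UUID_EPOCH + timedelta(microseconds=tid.time // 10)


def cache_fill_time(date_header, age_seconds):
    """Estimate when the cached object was fetched: Date minus Age."""
    return parsedate_to_datetime(date_header) - timedelta(seconds=age_seconds)


if __name__ == "__main__":
    rev, tid = parse_etag('W/"3769898/5a9fcc70-6c7b-11e9-942f-f658c0e8f0a2"')
    print("revision:", rev)
    print("render stamp:", timeuuid_to_datetime(tid))
    print("cache filled:", cache_fill_time("Sun, 30 Jun 2019 11:15:21 GMT", 71006))
```

If those assumptions hold, the render stamp in the ETag predates the 12 June edit, while the edge cache was filled less than a day before the request. That would point at the stored render being stale rather than at Varnish alone, but this is only an inference from one response.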

Event Timeline

Like T217540, this is a significant source of broken/outdated HTML (here with invalid images) delivered by Parsoid. This is pretty annoying for the offline versions of our wikis.

@Aklapper The whole offline/Kiwix stuff is handled by the new-readers team. That said, I am totally unsure whether it is right to put this here. I would really like to have a tag or project to gather all tickets impacting openZIM/Kiwix. Would that somehow be possible?

Could we do something to prevent the API from serving months-old pages? I am a bit surprised that, after two years, this ticket has not even been triaged.

This bug generates a wide range of oddities at the MWoffliner level. Here is the latest one:

The problematic article is here:

Kelson renamed this task from Parsoid cache invalidation seems not reliable to Parsoid cache invalidation seems not reliable (only on mobile API?). Aug 15 2021, 4:04 PM
TheDJ renamed this task from Parsoid cache invalidation seems not reliable (only on mobile API?) to Parsoid cache invalidation for mobile-sections seems not reliable. Tue, Nov 30, 9:27 PM
TheDJ added a subscriber: TheDJ.

This ticket seems separate from the blacklist issue identified in the other ticket.

@Kelson, do you have any idea how often this problem occurs?