Caching problems with mobile_main_page
Closed, ResolvedPublic

Description

Author: ninniuz

Description:
Specific mobile wikipedia main pages seem to be cached and ALWAYS retrieved from the server cache.
Typically those webpages contain references to contents which change each day.

E.g.
http://it.m.wikipedia.org still reports news from Dec 27th
http://sv.m.wikipedia.org reports news from Dec 20th


Version: unspecified
Severity: critical

Details

Reference
bz22014

Event Timeline

bzimport raised the priority of this task to High. Nov 21 2014, 10:51 PM
bzimport set Reference to bz22014.

hcatlin wrote:

I am looking into this, but something is very, very wrong here.

It seems to be an issue upstream with mediawiki. This is not cached
locally at all right now, and yet Squid cache keeps sending ruby
an old version. Why? I have no idea.

I have tried all sorts of differing headers and etc, but the problem
persists.

manop wrote:

I just want to report this bug as well.

In Thai Wikipedia Mobile
http://th.m.wikipedia.org

The cache has been there since the project started. The page still has the data from March 2, 2010 (e.g. the news is obsolete).

*** Bug 25804 has been marked as a duplicate of this bug. ***

hcatlin wrote:

The strangeness with this bug continues. The cache is being cleared. If I install a fresh copy of mobile, it gets an old version of the page. The data appears nowhere in the code, at all.

Using a curl session, I get this:

* About to connect() to th.wikipedia.org port 80 (#0)
*   Trying 91.198.174.232...
* Connected to th.wikipedia.org (91.198.174.232) port 80 (#0)
> GET /wiki/%E0%B8%AA%E0%B8%96%E0%B8%B2%E0%B8%99%E0%B8%B5%E0%B8%A2%E0%B9%88%E0%B8%AD%E0%B8%A2:%E0%B8%AB%E0%B8%99%E0%B9%89%E0%B8%B2%E0%B8%AB%E0%B8%A5%E0%B8%B1%E0%B8%81_%28%E0%B9%82%E0%B8%A1%E0%B8%9A%E0%B8%B2%E0%B8%A2%E0%B8%A5%E0%B9%8C%29 HTTP/1.1
> Host: th.wikipedia.org
> Accept: */*
> Accept-Encoding: gzip,deflate
> User-Agent: Mozilla/5.0 Wikimedia Mobile
> Accept-Charset: utf-8;q=0.7,*;q=0.7
>
* HTTP 1.0, assume close after body

< HTTP/1.0 200 OK
< Date: Thu, 04 Nov 2010 02:07:34 GMT
< Server: Apache
< Cache-Control: private, s-maxage=0, max-age=0, must-revalidate
< Content-Language: th
< Vary: Accept-Encoding,Cookie
< Last-Modified: Mon, 01 Nov 2010 15:14:05 GMT
< Content-Encoding: gzip
< Content-Length: 10891
< Content-Type: text/html; charset=UTF-8
< X-Cache: HIT from sq60.wikimedia.org
< X-Cache-Lookup: HIT from sq60.wikimedia.org:3128
< Age: 467872
< X-Cache: HIT from amssq34.esams.wikimedia.org
< X-Cache-Lookup: HIT from amssq34.esams.wikimedia.org:3128
< X-Cache: MISS from amssq40.esams.wikimedia.org
< X-Cache-Lookup: MISS from amssq40.esams.wikimedia.org:80
< Connection: close
<

* Expire cleared
* Closing connection #0

This returns HTML that is exactly the same as the "old" version. When I view the same page in my browser, I get a new version.

You can try this yourself if you are on UNIX.

curl -H "Accept: */*" -H "Accept-Encoding: gzip,deflate" -H "Accept-Charset: utf-8;q=0.7,*;q=0.7" -A "Mozilla/5.0 Wikimedia Mobile" http://th.wikipedia.org/wiki/%E0%B8%AA%E0%B8%96%E0%B8%B2%E0%B8%99%E0%B8%B5%E0%B8%A2%E0%B9%88%E0%B8%AD%E0%B8%A2:%E0%B8%AB%E0%B8%99%E0%B9%89%E0%B8%B2%E0%B8%AB%E0%B8%A5%E0%B8%B1%E0%B8%81_%28%E0%B9%82%E0%B8%A1%E0%B8%9A%E0%B8%B2%E0%B8%A2%E0%B8%A5%E0%B9%8C%29 -v | gunzip

That causes the problem again. If you want, you can grep the output for words in either of the pages.

It only occurs when gzip is turned on. If you ask for the non-gzipped page, you get a fresh copy.
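
For anyone who wants to check responses mechanically, here is a minimal sketch in Python that reads the headers from the curl session above and flags a response that was served as a cache HIT even though its Age exceeds the s-maxage it advertises. The function names are made up for illustration, and the "stale" rule is only my reading of these headers, not a statement about how the Squid layer is actually configured.

```python
import re


def summarize_cache_headers(headers):
    """Return (is_hit, age_seconds, s_maxage) from a dict of response headers."""
    # X-Cache can appear once per proxy layer, so accept a string or a list.
    x_cache = headers.get("X-Cache", [])
    if isinstance(x_cache, str):
        x_cache = [x_cache]
    is_hit = any(value.upper().startswith("HIT") for value in x_cache)
    age = int(headers.get("Age", 0))
    match = re.search(r"s-maxage=(\d+)", headers.get("Cache-Control", ""))
    s_maxage = int(match.group(1)) if match else None
    return is_hit, age, s_maxage


def looks_stale(headers):
    """Heuristic (an assumption): a HIT whose Age is larger than s-maxage."""
    is_hit, age, s_maxage = summarize_cache_headers(headers)
    return is_hit and s_maxage is not None and age > s_maxage


# Header values copied from the curl output above.
sample = {
    "Cache-Control": "private, s-maxage=0, max-age=0, must-revalidate",
    "Age": "467872",
    "X-Cache": [
        "HIT from sq60.wikimedia.org",
        "HIT from amssq34.esams.wikimedia.org",
        "MISS from amssq40.esams.wikimedia.org",
    ],
}

print(looks_stale(sample))
print(round(int(sample["Age"]) / 86400, 1), "days old")
```

An Age of 467872 seconds is about 5.4 days, which roughly matches a fetch on Nov 4 returning a copy stored shortly before Oct 30.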

hcatlin wrote:

I'm assigning this bug to Wikimedia because I can't believe this is the intended behaviour.

(In reply to comment #6)

I think that the correctly cached version will have a slightly different URL, like
http://th.wikipedia.org/wiki/%E0%B8%AA%E0%B8%96%E0%B8%B2%E0%B8%99%E0%B8%B5%E0%B8%A2%E0%B9%88%E0%B8%AD%E0%B8%A2:%E0%B8%AB%E0%B8%99%E0%B9%89%E0%B8%B2%E0%B8%AB%E0%B8%A5%E0%B8%B1%E0%B8%81_(%E0%B9%82%E0%B8%A1%E0%B8%9A%E0%B8%B2%E0%B8%A2%E0%B8%A5%E0%B9%8C)

Note that MediaWiki cannot purge every possible URL-encoding combination.

Is there any reason why we don't issue HTTP redirects from http://en.wikipedia.org/wiki/foo(bar%29 --> http://en.wikipedia.org/wiki/foo(bar), like we do if you make a request with %20 in it? It seems (saying this without knowing the issues involved) that while each page can have multiple URL encodings, it should have only one canonical encoding, and we should redirect people to it if they use an alternate URL form.
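
To make the canonical-encoding idea concrete, here is a rough Python sketch. It decodes a request path and re-encodes it with one fixed set of unescaped characters, so that foo(bar%29 and foo(bar) end up as the same string. The SAFE set and both function names are assumptions for illustration only; this is not MediaWiki's actual encoding rule.

```python
from urllib.parse import quote, unquote

# Hypothetical choice of characters left unescaped in the canonical form.
SAFE = "/:()"


def canonical_path(path):
    """Decode every percent-escape, then re-encode in one consistent way."""
    return quote(unquote(path), safe=SAFE)


def redirect_target(path):
    """Return the canonical path to redirect to, or None if already canonical."""
    canon = canonical_path(path)
    return None if canon == path else canon


print(redirect_target("/wiki/foo(bar%29"))
print(redirect_target("/wiki/foo(bar)"))
```

With something like this in front of the cache, only the canonical form would ever be stored, so a purge of that one URL would be enough.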

Until you guys decide what the correct way to avoid the cache is, could we have a workaround put in place?

Purging the cache on the Wiki page updates the mobile site, so perhaps you could consider temporarily using a cronjob to do this at 00:01 each day. The timezone should depend on the language.
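
A possible shape for that cron job, as a Python sketch: run it every minute, and it reports which language editions have just reached 00:01 local time and builds the purge URL for each. The UTC offsets are hypothetical fixed values that ignore daylight saving time, the function names are made up, and the /w/index.php path with action=purge is my assumption about how the purge would be requested.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import quote

# Hypothetical fixed UTC offsets per language edition (no DST handling).
UTC_OFFSET_HOURS = {"it": 1, "sv": 1, "th": 7}


def wikis_due_for_purge(now_utc, offsets=UTC_OFFSET_HOURS):
    """Return the language codes whose local time is exactly 00:01."""
    due = []
    for lang, hours in sorted(offsets.items()):
        local = now_utc.astimezone(timezone(timedelta(hours=hours)))
        if (local.hour, local.minute) == (0, 1):
            due.append(lang)
    return due


def purge_url(lang, title):
    """Build a purge URL for a page title (URL layout is an assumption)."""
    return f"http://{lang}.wikipedia.org/w/index.php?title={quote(title)}&action=purge"


now = datetime(2010, 11, 3, 17, 1, tzinfo=timezone.utc)
for lang in wikis_due_for_purge(now):
    print(purge_url(lang, "Main_Page"))
```

The script only prints the URLs; actually requesting them, and using each wiki's real main page title, is left out on purpose.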

(In reply to comment #8)

> Purging the cache on the Wiki page updates the mobile site, so perhaps you
> could consider temporarily using a cronjob to do this at 00:01 each day. The
> timezone should depend on the language.

I thought this whole bug was that purging the wiki page did not update the mobile site (?)

(In reply to comment #9)

> I thought this whole bug was that purging the wiki page did not update the
> mobile site (?)

My understanding is that the issue is that Squid serves a stale cached copy. Why this happens is unknown, although I suspect it's because the mobile app actually HITS the cache, as opposed to logged-in users on wp, who MISS it because of the settings they have.
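
A toy model of that suspicion, in Python: the response carries "Vary: Accept-Encoding,Cookie", so a cache would keep a separate entry per combination of those two request headers. The cache_key function is hypothetical and much simpler than what Squid really does, but it shows why an anonymous gzip request, an anonymous plain request and a logged-in request would not share a cached copy.

```python
def cache_key(url, request_headers, vary=("Accept-Encoding", "Cookie")):
    """Simplified cache key: the URL plus the value of each Vary header."""
    return (url,) + tuple(request_headers.get(name, "") for name in vary)


url = "http://th.wikipedia.org/wiki/Example"  # placeholder URL

mobile_gzip = cache_key(url, {"Accept-Encoding": "gzip,deflate"})
mobile_plain = cache_key(url, {})
logged_in = cache_key(
    url, {"Accept-Encoding": "gzip,deflate", "Cookie": "session=abc"}
)

# Three different keys, so three independent cache entries.
print(len({mobile_gzip, mobile_plain, logged_in}))
```

Under this model a stale gzip variant can sit in the cache while the non-gzipped and cookie-bearing variants stay fresh, which would be consistent with what has been reported so far.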

I tried on ro.wp to purge the cache and it worked. Also, on sv.m.wp there is a link to purge the cache, so they might have noticed the same thing themselves. The update is not immediate, it takes a few minutes, but not days.

hcatlin wrote:

The mobile site invalidates the homepage cache every couple hours. These really old pages have nothing to do with that. If you clear the mobile cache and start again... it *still* gives old versions.

Got a complaint about this on OTRS a few days ago.

(In reply to comment #11)

> The mobile site invalidates the homepage cache every couple hours. These really
> old pages have nothing to do with that. If you clear the mobile cache and start
> again... it *still* gives old versions.

We should fix this behavior on the Wikimedia side, but in the meantime you're probably best off sending a Cookie header or something to bypass Squid.
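
As a sketch of what sending a Cookie header could look like, here is a Python snippet that builds (but does not send) a request carrying one. The cookie name and value are placeholders, and whether an arbitrary cookie is enough to bypass Squid is an assumption based only on the "Vary: Accept-Encoding,Cookie" response header seen earlier.

```python
from urllib.request import Request


def build_uncached_request(url, cookie="bypass=1"):
    """Build a GET request with a Cookie header; the cookie is a placeholder."""
    return Request(
        url,
        headers={
            "User-Agent": "Mozilla/5.0 Wikimedia Mobile",
            "Cookie": cookie,
        },
    )


req = build_uncached_request("http://th.wikipedia.org/wiki/Example")
print(req.get_header("Cookie"))
```

Passing the request to urllib.request.urlopen would actually fetch the page; that step is omitted so the sketch stays free of network access.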

Assigning to Tomasz since this is Mobile-related and he'll need to have someone track down what is happening
on the squid side.

hcatlin wrote:

What kind of cookie do I send? Just... any cookie at all?

I would be very happy to fix this!

(In reply to comment #13)

> We should fix this behavior on the Wikimedia side, but in the meantime you're
> probably best off sending a Cookie header or something to bypass Squid.

Can't you just send the parentheses as decoded characters rather than %28 and %29?

Would like to get this fixed, but shouldn't be a 1.18 deployment blocker.

Could someone test this on the new mobile front end?

How can we test this? Do you have any page that changes daily? Also, I've gone to the site mentioned in http://blog.wikimedia.org/2011/06/10/testing-mobile-prototype/ and there is no mobile interface, at least on my Nokia 5800. Is there something I'm doing wrong?

(In reply to comment #19)

> How can we test this? Do you have any page that changes daily? Also, I've gone
> to the site mentioned in
> http://blog.wikimedia.org/2011/06/10/testing-mobile-prototype/ and there is no
> mobile interface, at least on my Nokia 5800. Is there something I'm doing
> wrong?

You can pull up the new gateway by loading the following URL - http://en.wikipedia.org/wiki/Main_Page?useformat=mobile .

Once we resolve bug #29505 you'll be able to follow our new instructions at http://blog.wikimedia.org/2011/08/17/calling-mobile-testers-for-round-two/ to test without having to add useformat=mobile for each request.

On en.wp the date is correct, but on ro.wp the date shown is 16th of August.

According to http://ro.wikipedia.org/wiki/Special:Versiune , Mobile Frontend is active on ro.wp, so I expect http://ro.wikipedia.org/wiki/Pagina_principal%C4%83?useformat=mobile takes me to the new gateway. If this is the case, then the bug is not fixed.

preilly wrote:

This should be fixed now with the new Mobile Frontend extension.

(In reply to comment #22)

> This should be fixed now with the new Mobile Frontend extension.

Except it isn't - see my comment #21. I also tested with it and sv tonight and they're still wrong. It might be just some deployment problem that makes me see the old gateway, but in this case I would like to see some more precise instructions on how to debug this. I am willing to continue testing this.

preilly wrote:

(In reply to comment #23)

> Except it isn't - see my comment #21. I also tested with it and sv tonight and
> they're still wrong. It might be just some deployment problem that makes me see
> the old gateway, but in this case I would like to see some more precise
> instructions on how to debug this. I am willing to continue testing this.

Does the W logo in the upper right corner have the word "beta" in it?

preilly wrote:

(In reply to comment #24)

> Does the W logo in the upper right corner have the word "beta" in it?

Sorry, I meant upper left corner.

(In reply to comment #25)

> Sorry, I meant upper left corner.

Yes, I got that :) No, there is no beta next to the W, so this means I probably see the old version.

But why does this happen, since the Mobile Frontend extension is installed on all the wikis?

preilly wrote:

(In reply to comment #26)

> Yes, I got that :) No, there is no beta next to the W, so this means I probably
> see the old version.
>
> But why does this happen, since the Mobile Frontend extension is installed on
> all the wikis?

You need to opt-in to the new beta:
http://en.wikipedia.org/?mobileaction=opt_in_mobile_site

Just change the language to the one that you're currently testing.

(In reply to comment #27)

> You need to opt-in to the new beta:
> http://en.wikipedia.org/?mobileaction=opt_in_mobile_site
>
> Just change the language to the one that you're currently testing.

Yes, this time I got the "W beta" and today's news, so I guess we can close this bug.

I have just one more question: after opt-in, the home page was empty. What page should have been displayed there?

preilly wrote:

(In reply to comment #28)

> Yes, this time I got the "W beta" and today's news, so I guess we can close
> this bug.
>
> I have just one more question: after opt-in, the home page was empty. What page
> should have been displayed there?

This is due to that language not using the default selectors for the sections. This is addressed in another bug.

*** Bug 30772 has been marked as a duplicate of this bug. ***

(In reply to comment #29)

> This is due to that language not using the default selectors for the sections.
> This is addressed in another bug.

We're tracking missing main pages in https://bugzilla.wikimedia.org/show_bug.cgi?id=30785