On zh.wikipedia.org, MediaWiki wrongly interprets an article about main pages as the wiki's main page
Open, Low · Public · Bug Report

Description

In zhwiki, both pages ( [[Wikipedia:首页]] and [[首页]] ) are rendered as if they were a main page.

Symptoms:

  • wgIsMainPage is true
  • The browser title bar of [[首页]] shows no page title; instead it shows only the site name "Wikipedia" (see screenshot)
  • The first tab says "Main Page" instead of "Article"

Event Timeline

Shizhao created this task. · Jul 31 2019, 3:36 AM
Restricted Application added subscribers: Cosine02, Aklapper. · Jul 31 2019, 3:36 AM
Cwek added a subscriber: Cwek. · Jul 31 2019, 3:43 AM

When I read mw.config.values.wgIsMainPage on [[首页]], it is undefined, whereas on [[Wikipedia:首页]] I get mw.config.values.wgIsMainPage = true.

Hamishcn added a subscriber: Hamishcn. · Edited · Jul 31 2019, 5:23 AM

I don't know whether this belongs here or in a new task, but I found that on recently edited pages in the main namespace, the tagline under the page title now shows '出自Wikipedia' ("From Wikipedia") instead of '维基百科,自由的百科全书' ("Wikipedia, the free encyclopedia").

In T229379#5378564, @RazeSoldier wrote:

When I read mw.config.values.wgIsMainPage on [[首页]], it is undefined, whereas on [[Wikipedia:首页]] I get mw.config.values.wgIsMainPage = true.

For me, mw.config.values.wgIsMainPage is true on both [[首页]] and [[Wikipedia:首页]].

Viztor added a subscriber: Viztor. · Jul 31 2019, 8:12 AM

Reproducible. It is likely the PHP7 instance of zhwiki has a different config than the regular version, or some other deployed versions. It's probably just a configuration error.

Shizhao added a comment. · Edited · Aug 1 2019, 6:32 AM

Maybe a cache problem?

see screenshot https://t.me/wikipedia_zh_n/730220

A mixed page: the title and description of [[首页]] combined with the content of [[WP:首页]].

Krinkle updated the task description. · Aug 1 2019, 3:30 PM
Krinkle renamed this task from "wgIsMainPage error in zhwiki" to "On zh.wikipedia.org, MediaWiki wrongly interprets an article about main pages as the wiki's main page". · Aug 1 2019, 3:33 PM
Krinkle updated the task description.
Krinkle removed a project: MediaWiki-Cache.
Krinkle added a subscriber: Krinkle.

Does not seem related to caching, and not related to the mw.config variable either. It seems that MW genuinely sees this title as being equal to the wiki's main page, with all expected side-effects that brings.

Maybe it is ignoring the namespace prefix somehow?

Sometimes, in the left sidebar, the logo links to [[首页]] while the "首页" link points to [[Wikipedia:首页]].

I've confirmed via WikimediaDebug that force-refreshing https://zh.wikipedia.org/wiki/首页?uselang=en with "PHP 7.x" enabled leads to the correct result of it being a regular page, and with "PHP 7.x" off (HHVM), it consistently leads to the incorrect result of it being like a Main Page.

Viztor added a comment. · Edited · Aug 1 2019, 8:31 PM

I've confirmed via WikimediaDebug that force-refreshing https://zh.wikipedia.org/wiki/首页?uselang=en with "PHP 7.x" enabled leads to the correct result of it being a regular page, and with "PHP 7.x" off (HHVM), it consistently leads to the incorrect result of it being like a Main Page.

<s>How do you force PHP 7.x/HHVM?</s>
Oh, no worries. You already said.

Krinkle added a comment. · Edited · Aug 1 2019, 8:34 PM

No problem :) - so, by using WikimediaDebug :)

Viztor added a comment. · Edited · Aug 1 2019, 11:04 PM

It appears PHP7.x instances are correctly configured, and the problems rest with HHVM instances.

WDoranWMF triaged this task as Low priority. · Aug 6 2019, 6:39 PM

Hm. Trying to open https://zh.wikipedia.org/wiki/首页 both logged in, logged out and bypassing Varnishes gives me a normal page now while opening https://zh.wikipedia.org/wiki/Wikipedia:首页 gives the main page. Verified that served by HHVM. Is this still reproducible for others?

Hm. Trying to open https://zh.wikipedia.org/wiki/首页 both logged in, logged out and bypassing Varnishes gives me a normal page now while opening https://zh.wikipedia.org/wiki/Wikipedia:首页 gives the main page. Verified that served by HHVM. Is this still reproducible for others?

Appears to be correct after several tests here as well.

In T229379#5398021, @Pchelolo wrote:

Hm. Trying to open https://zh.wikipedia.org/wiki/首页 both logged in, logged out and bypassing Varnishes gives me a normal page now while opening https://zh.wikipedia.org/wiki/Wikipedia:首页 gives the main page. Verified that served by HHVM. Is this still reproducible for others?

This bug is not reproducible every time...

In T229379#5398056, @Viztor wrote:

Hm. Trying to open https://zh.wikipedia.org/wiki/首页 both logged in, logged out and bypassing Varnishes gives me a normal page now while opening https://zh.wikipedia.org/wiki/Wikipedia:首页 gives the main page. Verified that served by HHVM. Is this still reproducible for others?

Appears to be correct after several tests here as well.

I am still able to reproduce this problem

In T229379#5398056, @Viztor wrote:

Hm. Trying to open https://zh.wikipedia.org/wiki/首页 both logged in, logged out and bypassing Varnishes gives me a normal page now while opening https://zh.wikipedia.org/wiki/Wikipedia:首页 gives the main page. Verified that served by HHVM. Is this still reproducible for others?

Appears to be correct after several tests here as well.

I am still able to reproduce this problem

It would be really helpful if you posted the response headers. Also, could you bypass the frontend cache with, for example, a dummy query parameter like ?a=b and try again?
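A dummy parameter of this kind can also be appended programmatically when testing many URLs. The following is a minimal sketch in Python using only the standard library; the helper name add_cache_buster is hypothetical, and the default a=b simply mirrors the example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url, key="a", value="b"):
    """Return url with one extra dummy query parameter appended.

    Existing query parameters are kept in their original order.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_cache_buster(
    "https://zh.wikipedia.org/wiki/%E9%A6%96%E9%A1%B5?uselang=en"
))
```

Changing the value on each request (for instance a counter) would give every request its own URL, assuming the cache keys on the full URL.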

Shizhao added a comment. · Edited · Aug 7 2019, 3:10 AM

In the browser's incognito mode (no cookies, not logged in), I can't reproduce it.

In normal mode (logged in) I can reproduce the problem, and it is still reproducible after bypassing the frontend cache and/or clearing cookies. Response headers:

accept-ranges: bytes
age: 0
backend-timing: D=872811 t=1565146491541012
cache-control: private, s-maxage=0, max-age=0, must-revalidate
content-encoding: gzip
content-language: zh
content-security-policy-report-only: script-src 'unsafe-eval' 'self' meta.wikimedia.org *.wikimedia.org *.wikipedia.org *.wikinews.org *.wiktionary.org *.wikibooks.org *.wikiversity.org *.wikisource.org wikisource.org *.wikiquote.org *.wikidata.org *.wikivoyage.org *.mediawiki.org 'unsafe-inline'; default-src 'self' data: blob: upload.wikimedia.org https://commons.wikimedia.org meta.wikimedia.org *.wikimedia.org *.wikipedia.org *.wikinews.org *.wiktionary.org *.wikibooks.org *.wikiversity.org *.wikisource.org wikisource.org *.wikiquote.org *.wikidata.org *.wikivoyage.org *.mediawiki.org wikimedia.org; style-src 'self' data: blob: upload.wikimedia.org https://commons.wikimedia.org meta.wikimedia.org *.wikimedia.org *.wikipedia.org *.wikinews.org *.wiktionary.org *.wikibooks.org *.wikiversity.org *.wikisource.org wikisource.org *.wikiquote.org *.wikidata.org *.wikivoyage.org *.mediawiki.org wikimedia.org 'unsafe-inline'; report-uri /w/api.php?action=cspreport&format=json&reportonly=1&
content-type: text/html; charset=UTF-8
date: Wed, 07 Aug 2019 02:54:52 GMT
expires: Thu, 01 Jan 1970 00:00:00 GMT
last-modified: Wed, 07 Aug 2019 02:39:48 GMT
server: mw1319.eqiad.wmnet
server-timing: cache;desc="pass"
status: 200
strict-transport-security: max-age=106384710; includeSubDomains; preload
vary: Accept-Encoding,Cookie,Accept-Language,Authorization,X-Seven
x-analytics: ns=0;page_id=135739;loggedIn=1;WMF-Last-Access=07-Aug-2019;WMF-Last-Access-Global=07-Aug-2019;https=1
x-cache: cp1085 pass, cp2016 pass, cp5008 pass, cp5009 pass
x-cache-status: pass
x-client-ip: 35.220.191.193
x-content-type-options: nosniff
x-powered-by: PHP/7.2.16-1+0~20190307202415.17+stretch~1.gbpa7be82+wmf1
x-varnish: 132737294, 314620476, 650486249, 1023716726
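
The fields most relevant to this discussion (which engine served the request, whether the cache was bypassed, and what the backend reported about the page) can be pulled out of such a dump mechanically. The following is a minimal sketch in Python; the function name summarize_headers is hypothetical, and reading the ns field of x-analytics as the namespace number is an assumption.

```python
def summarize_headers(headers):
    """Extract engine, cache status, namespace and login state from headers.

    headers is a plain mapping of header name to value; names are matched
    case-insensitively.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    # x-analytics is a semicolon-separated list of key=value pairs.
    analytics = dict(
        pair.split("=", 1)
        for pair in lowered.get("x-analytics", "").split(";")
        if "=" in pair
    )
    return {
        # "PHP/7.2.16-..." -> "PHP", "HHVM/3.18.6-dev" -> "HHVM"
        "engine": lowered.get("x-powered-by", "").split("/", 1)[0],
        "cache_status": lowered.get("x-cache-status", ""),
        "namespace": analytics.get("ns"),  # assumed to be the namespace number
        "logged_in": analytics.get("loggedIn") == "1",
    }
```

Applied to the headers above, this would report the engine as PHP, the cache status as pass, and a logged-in request.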
Shizhao changed the subtype of this task from "Task" to "Bug Report". · Aug 7 2019, 12:33 PM
Krinkle added a comment. · Edited · Aug 7 2019, 2:32 PM

Hm. Trying to open https://zh.wikipedia.org/wiki/首页 both logged in, logged out and bypassing Varnishes gives me a normal page now while opening https://zh.wikipedia.org/wiki/Wikipedia:首页 gives the main page. Verified that served by HHVM. Is this still reproducible for others?

Open https://zh.wikipedia.org/wiki/首页?uselang=en, enable WikimediaDebug (which bypasses Varnish and uses HHVM by default).

You'll see the first tab is labelled "Main Page" instead of "Article" which is the symptom of the problem, given that that is not the wiki's main page. Another symptom is the HTML title being "Wikipedia" (no article name present).

@Krinkle hm, ok, now I can reproduce, but it's interesting.

In chrome, if I clear the cookies and open https://zh.wikipedia.org/wiki/首页?uselang=en for the first time, I see the page title just "Wikipedia" - so, reproduced. If I refresh the page without clearing the cookies, sometimes it changes to 首页 - Wikipedia and then stays like that. However, one time it didn't and was stuck on "Wikipedia".

I can both reproduce and not reproduce with x-powered-by: HHVM/3.18.6-dev

Did the intermediate weird ones get a 304 Not Modified response or 200 OK? Our IMS validation is mainly based on when the page is last modified, so local browser cache does not vary by PHP version. This has bitten me many times also when switching between production and X-Wikimedia-Debug as the debug request can correctly route there, but then get a 304 Not Modified response from the debug server (and thus effectively make the browser re-use the old HTML).

@Krinkle checked again. I can both reproduce and not reproduce with a 200.

Example:

  1. Clear browsing data
  2. Request https://zh.wikipedia.org/wiki/%E9%A6%96%E9%A1%B5?uselang=en -> reproduced, response status: 200
  3. Request again -> reproduced, 304
  4. Request https://zh.wikipedia.org/wiki/%E9%A6%96%E9%A1%B5?uselang=en&a=b -> not reproduced, 200.
  5. Request https://zh.wikipedia.org/wiki/%E9%A6%96%E9%A1%B5?uselang=en -> 304
  6. Clear browser cache, request https://zh.wikipedia.org/wiki/%E9%A6%96%E9%A1%B5?uselang=en -> 200, not reproduced.

Even more interesting. I've tried curl.

curl -i -H 'X-wikimedia-debug: mwdebug1002.eqiad.wmnet' 'https://zh.wikipedia.org/wiki/%E9%A6%96%E9%A1%B5?uselang=en' | less

On the first attempt I get <title>Wikipedia</title>, while on the second attempt I get the correct 首页 - Wikipedia.

To completely rule out any frontend caching, I repeated the experiment from inside the production cluster with curl and got the same result: the first request reproduces the problem, the second does not. However, I could only reproduce it on mwdebug hosts. Randomly querying production hosts doesn't reproduce it.
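
Because the symptom is intermittent, comparing repeated responses by eye is error-prone. The following is a small sketch in Python that classifies an already fetched HTML body by its <title>, based on the symptom described above (the title is the bare site name when the bug occurs). The function names are hypothetical, and fetching is deliberately left out.

```python
import re

TITLE_RE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html):
    """Return the text of the first <title> element, or None if absent."""
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else None


def looks_like_main_page(html, site_name="Wikipedia"):
    """True when the HTML title is just the site name, with no page name."""
    return extract_title(html) == site_name


# The two outcomes seen in the curl experiment:
print(looks_like_main_page("<title>Wikipedia</title>"))
print(looks_like_main_page("<title>首页 - Wikipedia</title>"))
```

Running this over the output of several consecutive requests would show how often the incorrect variant is served.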

Viztor added a comment. · Aug 8 2019, 4:04 AM

@Krinkle hm, ok, now I can reproduce, but it's interesting.
In chrome, if I clear the cookies and open https://zh.wikipedia.org/wiki/首页?uselang=en for the first time, I see the page title just "Wikipedia" - so, reproduced. If I refresh the page without clearing the cookies, sometimes it changes to 首页 - Wikipedia and then stays like that. However, one time it didn't and was stuck on "Wikipedia".
I can both reproduce and not reproduce with x-powered-by: HHVM/3.18.6-dev

If you use English as the interface language, this is expected behavior, given the order in which system messages are read.

Viztor added a comment. · Edited · Aug 8 2019, 4:06 AM

<s>When you set syslang as English, the software will read MW:Mainpage-url/en, and because it is not set locally, it will read default value set in software, which is Mainpage. </s> appears to be incorrect.

Antigng added a subscriber: Antigng. · Edited · Aug 8 2019, 12:13 PM

Collected a few problematic responses. It seems that this issue has nothing to do with the following:

  1. Cookies. My code for fetching these pages sends no cookies to the server except the one that enforces PHP 7, so cookies are not a necessary condition for these errors. This also implies that switching the default language (which cannot be done without cookies) is not necessary for reproducing them.
  2. Client-side 304 issues, given that my code doesn't implement client-side caching at all.
  3. HHVM/PHP 7 preference. All of the above requests were served by PHP 7, so the claim "PHP7.x instances are correctly configured" is not true.
  4. Varnish cache, as all above requests encountered a cache miss.


Update:
The following factors could now be excluded:

  1. Namespaces. Similar problems occur in all namespaces.
  2. Parser/parser cache. Loading pages that don't invoke parser at all (e.g. special pages, pages in "Topic" namespace) could still lead to error.

Based on these observations, I suspect that this is purely a message/message-cache issue. I've noticed that several changes were recently made to Message.php and MessageCache.php (T228555). Could one of them have the potential to cause these problems?

BTW, all pages listed above contain T228876-like issues, indicating they are probably caused by the same bug.

Going to https://zh.wikipedia.org/wiki/首页 or https://zh.wikipedia.org/wiki/首页?uselang=en reproduces this error as well as 出自Wikipedia (T228876), but the error does not appear on https://zh.wikipedia.org/wiki/首页?uselang=zh or https://zh.wikipedia.org/wiki/首页?uselang=zh-xx

Viztor added a comment. · Edited · Aug 11 2019, 3:25 AM

Going to https://zh.wikipedia.org/wiki/首页 or https://zh.wikipedia.org/wiki/首页?uselang=en reproduces this error as well as 出自Wikipedia (T228876), but the error does not appear on https://zh.wikipedia.org/wiki/首页?uselang=zh or https://zh.wikipedia.org/wiki/首页?uselang=zh-xx

This is the expected behavior, not a bug, at least unrelated to this one. See T229992.

Does not seem related to caching, and not related to the mw.config variable either. It seems that MW genuinely sees this title as being equal to the wiki's main page, with all expected side-effects that brings.
Maybe it is ignoring the namespace prefix somehow?

@WDoranWMF Do you think it would be appropriate to raise the priority of this bug? It has been there for weeks with no sign of "self-fixing", and the impact only seems to be growing. Now there are also reports about the sidebar being wrong. AFAIK, anything that causes the main page of a website to display inappropriately is critical.

Wang_Qiliang added a subscriber: Wang_Qiliang. · Edited · Aug 17 2019, 4:32 AM

A similar phenomenon occurs on zh.wikiversity: mw.config.values.wgIsMainPage is true on [[首页]], while on [[Project:首页]] it returns 'undefined'.

When I read mw.config.values.wgIsMainPage on [[首页]], it is undefined, whereas on [[Wikipedia:首页]] I get mw.config.values.wgIsMainPage = true.

Wang_Qiliang added a comment. · Edited · Aug 17 2019, 2:44 PM

When I check https://zh.wikipedia.org/wiki/MediaWiki:Mainpage/zh-hans , it says "Project:首页" (Wikipedia:首页). But the default value of the zh-cn and zh-hk versions of this page is 首页. This can also be reproduced on zh.wikiversity. I tried to modify it, but it had no effect.
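
One way to see why this observation could matter: if the main-page title is looked up from a message whose value differs per language variant, then the answer to "is this the main page?" depends on which variant's value happens to be read. The following is a toy model in Python, not MediaWiki's actual code. The message values are the ones reported above; the alias table and function names are assumptions made for illustration.

```python
# Message values as reported above; everything else is assumed.
MAINPAGE_MESSAGE = {
    "zh-hans": "Project:首页",
    "zh-cn": "首页",
    "zh-hk": "首页",
}

# Assumption: "Project:" is an alias for the wiki's own project namespace.
NAMESPACE_ALIASES = {"Project": "Wikipedia"}


def normalize(title):
    """Rewrite an aliased namespace prefix to its canonical name."""
    prefix, sep, rest = title.partition(":")
    if sep and prefix in NAMESPACE_ALIASES:
        return NAMESPACE_ALIASES[prefix] + ":" + rest
    return title


def is_main_page(title, variant):
    """True if title equals the main-page message for the given variant."""
    return normalize(title) == normalize(MAINPAGE_MESSAGE[variant])


for variant in ("zh-hans", "zh-cn"):
    print(variant, is_main_page("首页", variant),
          is_main_page("Wikipedia:首页", variant))
```

In this model the article [[首页]] counts as the main page only when a zh-cn or zh-hk value is used, which would match the intermittent symptom if different requests read different variants. Whether that is what happens in production is not established here.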

I do not see a reason why to raise priority of this task, given the limited impact of this problem. Also see https://www.mediawiki.org/wiki/Bug_management/Development_prioritization - thanks!

Wang_Qiliang added a comment. · Edited · Aug 17 2019, 2:56 PM

I do not see a reason why to raise priority of this task, given the limited impact of this problem. Also see https://www.mediawiki.org/wiki/Bug_management/Development_prioritization - thanks!

On Chinese Wikipedia, the article page [[首页]] (lit. Main Page) is a serious article concerning a main page of a website, not the front page of the wiki like enwp. So, could you please reconsider the priority, thank you.

Wang_Qiliang raised the priority of this task from Low to High. · Thu, Sep 12, 3:22 PM

The bug has become more serious: it now shows up more frequently.

Wang_Qiliang lowered the priority of this task from High to Low. · Fri, Sep 13, 6:35 AM

In fact, nope, due to my lack of time.

However, as you can see, it is just inconceivable to everyone on Chinese Wikimedia projects, that such a serious bug only deserves a "low priority" (literal meaning), and there is even no assignee to this task.

I am so disappointed that WMF's staff has actually shelved this ticket. "The priority field summarizes and reflects the reality and does not cause it." Yes, but isn't this bug serious? I did not get the point, but per your suggestion, I changed it back. May Chinese Wikipedia not become a Wikibugia.

Aklapper added a comment. · Edited · Fri, Sep 13, 12:55 PM

@Wang_Qiliang: Please see "Why has nobody fixed this issue yet?" for more context. Unfortunately, no organization has unlimited resources to fix each and every problem quickly.