
Google displays “Wikipedia” as site title for some non-Wikipedia pages
Open, Needs Triage | Public | BUG REPORT

Assigned To
None
Authored By
Pols12
Oct 4 2023, 9:57 PM
Referenced Files
F45144161: Screenshot from 2024-04-08 11-27-28.png
Mon, Apr 8, 5:58 AM
F39387735: image.png
Oct 22 2023, 5:20 PM
F38911518: image.png
Oct 20 2023, 7:13 AM
F38271265: 2023-10-15_22-04.png
Oct 15 2023, 6:39 PM
F38221195: Capture d’écran du 2023-10-13 15-17-06.png
Oct 13 2023, 1:18 PM
F37984865: image.png
Oct 5 2023, 5:58 AM
F37984772: image.png
Oct 5 2023, 5:58 AM
F37984723: image.png
Oct 5 2023, 5:58 AM
Tokens
"Love" token, awarded by Effeietsanders.

Description

Steps to replicate the issue:

What happens?:
When pages from Wikimedia projects appear in search results, the site name is displayed as “Wikipedia” (or “Wikipedia Commons”).

What should have happened instead?:
It should display the real project name (“Wiktionary”, “Wikidata”, “Wikimedia Commons”…) or its localized version (“Wiktionnaire”, “Βικιλεξικό”…).

Other information:
Reported on Meta-Wiki and on French Wiktionary.

To Do

  • From the Google Search Console, use the URL inspection tool to understand where the “Wikipedia” site name comes from.
  • If og:site_name is still missing from the source of the Wiktionary pages indexed by Google, manually request a recrawl.
  • Check whether the home page that should host the Schema WebSite structured data must be at the root (/) or may be located elsewhere (e.g. /wiki/Wiktionary:Main_Page); see T120085.
  • Choose between the recommended Schema:WebSite and the alternative og:site_name.
  • Apply the chosen solution (if Schema:WebSite is chosen, we should revert the patch that adds og:site_name to the Wiktionaries and MediaWiki.org).

Event Timeline

Nikerabbit subscribed.
  1. I can't reproduce

image.png (647×710 px, 82 KB)

  2. Not in scope for the Language team either, maybe SRE?

Can repro.

image.png (408×1 px, 69 KB)

image.png (430×1 px, 69 KB)

And it's affecting more than just English and French entries.
image.png (322×1 px, 43 KB)

I submitted feedback to Google 10 days ago, but nothing has changed since then.

Removing the SRE task, that doesn't seem related.

Also tagging WMF-Communications for review, since this issue increases confusion around Wikimedia trademarks.

Today, the French Wiktionary link uses Wikipedia as title. The French Wikipedia link uses a weird logo. And the English Wikipedia looks correct...

Capture d’écran du 2023-10-13 15-17-06.png (758×782 px, 138 KB)

We have the same problem (No wiki favicon next to results) for fawiki. Now I don't recall if the favicon was shown before (honestly I never paid attention) but according to one of Fawiki users, this was the case, and the favicon has been replaced with the "globe" logo for some time now.

2023-10-15_22-04.png (198×717 px, 50 KB)

I have the same problem for numerous Wiktionary language variants, I've got it from wiktionary bread search

image.png (379×853 px, 51 KB)

SRE is not responsible for this.

It's worth noting that mediawiki.org is also labeled as Wikipedia:

image.png (611×910 px, 100 KB)

R4356th renamed this task from Google displays “Wikipedia” as site title for some Wiktionary pages to Google displays “Wikipedia” as site title for some Wiktionary and MediaWiki.org pages. Oct 22 2023, 7:25 PM

@SCherukuwada, seeing T302625 and that you are working on T325607, could you please take a look here and at T349361? Thanks.

That bit you're seeing is called the "Site Name", as distinct from the page title. The idea behind it seems to be to tell you what site you're looking at (without needing to look at the URL itself, or relying on the page title to include the name of the site).

The way Google processes site names is documented on the Google Search Central site.

It seems that when a page doesn't specify a site name, Google simply "guesses". From what I can see, if it can't guess, it just uses the domain name as the site name. A number of Wiki projects end up being guessed incorrectly as "wikipedia".

Recommendation:

  1. I'll request partnerships to talk to Google about this. The most likely fix is that they'll default to wiktionary.org or wikisource.org if they fix their guessing.
  2. If we want to control this on our end, we need to set og:site_name (or use some of the other recommendations on Google's page) to wgNoticeProject or something, but I defer to those who know better. @ovasileva ?

That page you linked says "Our site name system will also consider content in og:site_name, <title>, heading elements, and other text on a home page." and title (as well as og:title) is already set on Wikimedia wikis though.

The next sentence says that the preferred method is the JSON-LD one, though.
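For reference, a minimal sketch of what the JSON-LD approach could look like, assuming the usual schema.org WebSite shape (`name`, `url`, optional `alternateName`). The function name `website_jsonld`, the example URL, and the alternate name are illustrative only; this is not how MediaWiki emits structured data.

```python
import json


def website_jsonld(name, url, alternate_names=()):
    """Build a <script> element carrying schema.org WebSite structured data.

    Hypothetical helper for illustration; not part of MediaWiki.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "WebSite",
        "name": name,
        "url": url,
    }
    if alternate_names:
        data["alternateName"] = list(alternate_names)
    # Keep non-ASCII site names (e.g. localized project names) readable.
    payload = json.dumps(data, ensure_ascii=False)
    # Escape "</" so the payload cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">' + payload + "</script>"


if __name__ == "__main__":
    # Example values are assumptions chosen to match the discussion above.
    print(website_jsonld("Wiktionnaire", "https://fr.wiktionary.org/", ["Wiktionary"]))
```

By contrast, og:site_name is a single `<meta>` tag that can be emitted on every page, which is why it is the easier of the two options to roll out.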

For og:title, it is defined as a property of a page while og:site_name is a property of a site as a whole. It seems to be a good reason not to use og:title for the intended use they document on that page.

Partnerships just told me they will reach out to Google soon. We'll post updates here.

Change 969396 had a related patch set uploaded (by Pols12; author: Pols12):

[mediawiki/core@master] Adds og:site_name meta tag on all pages

https://gerrit.wikimedia.org/r/969396

Partnerships just told me they will reach out to Google soon. We'll post updates here.

Has there been any news in the last month?

The patch from @Pols12 looks like a useful addition to the codebase regardless of the discussion here but I have asked on the Gerrit patch that we separate the decision of whether to include an og:site tag from making it possible to add one.

Change 981636 had a related patch set uploaded (by Pols12; author: Pols12):

[operations/mediawiki-config@master] Make wiktionary and mw.org provide og:site_name

https://gerrit.wikimedia.org/r/981636

Change 969396 merged by jenkins-bot:

[mediawiki/core@master] Skin: Allow og:site_name meta tag

https://gerrit.wikimedia.org/r/969396

Change 981636 merged by jenkins-bot:

[operations/mediawiki-config@master] Make wiktionary and mw.org provide og:site_name

https://gerrit.wikimedia.org/r/981636

Mentioned in SAL (#wikimedia-operations) [2023-12-20T14:04:01Z] <lucaswerkmeister-wmde@deploy2002> Started scap: Backport for [[gerrit:981636|Make wiktionary and mw.org provide og:site_name (T348203)]]

Mentioned in SAL (#wikimedia-operations) [2023-12-20T14:06:18Z] <lucaswerkmeister-wmde@deploy2002> pols12 and lucaswerkmeister-wmde: Backport for [[gerrit:981636|Make wiktionary and mw.org provide og:site_name (T348203)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2023-12-20T14:19:56Z] <lucaswerkmeister-wmde@deploy2002> Finished scap: Backport for [[gerrit:981636|Make wiktionary and mw.org provide og:site_name (T348203)]] (duration: 15m 54s)

The recommended fix is to provide schema:name on the home page.
However, “By home page, we mean the domain or subdomain level root URI.” (GSC)
Since our root URIs return HTTP 301 (moved permanently), I don’t know whether we can use the redirection target to place schema properties.

So the merged patch instead includes og:site_name on all pages; that should be a fallback source for Google. I have no access to Google Search Console, but I expect Wiktionary and MediaWiki.org pages to be re-indexed soon (i.e. in less than 2 weeks).

If Wiktionary and MediaWiki.org results show the correct site name on Google before 10 January, we can consider applying the patch to all wikis (@Lucas_Werkmeister_WMDE reported a “Wikipedia Commons” site name for Commons; I just noticed a “Wikipedia” site name for Wikidata too).

Otherwise, we should consider a way to apply the recommended solution with schema properties. That may require moving forward with T120085 first.

Since it hasn't been mentioned explicitly: equally a problem for Wikimedia Commons.

Pols12 renamed this task from Google displays “Wikipedia” as site title for some Wiktionary and MediaWiki.org pages to Google displays “Wikipedia” as site title for some non-Wikipedia pages. Dec 27 2023, 10:53 PM
Pols12 updated the task description.

Unfortunately, my patch does not seem to have fixed the issue: wiktionnaire Jeux paralympiques still displays “Wikipedia” as the site name, even though the source of the cached page does contain <meta property="og:site_name" content="Wiktionnaire">.

So we need to understand where the site name comes from.
We need help from someone with Search Console access.

We should also check whether the structured data for the home page may be located at the root redirection target (i.e. /wiki/Wiktionary:Main_Page on English Wiktionary).
I found a wiki that is well indexed by Google and uses a site name different from its domain: איןציקלופדיה (both schema:name and og:site_name are used). So the root being a redirection does not seem to be a blocker.
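The kind of check described above (does a page's source carry og:site_name, a schema.org WebSite name, or both?) can be sketched with the Python standard library. The class and function names are hypothetical, and the sample HTML is made up for illustration; it tells us nothing about what Google actually reads.

```python
import json
from html.parser import HTMLParser


class SiteNameParser(HTMLParser):
    """Collect og:site_name and JSON-LD WebSite names from HTML source."""

    def __init__(self):
        super().__init__()
        self.og_site_name = None
        self.jsonld_names = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("property") == "og:site_name":
            self.og_site_name = attrs.get("content")
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                doc = json.loads("".join(self._buffer))
            except ValueError:
                return  # Ignore malformed JSON-LD blocks.
            items = doc if isinstance(doc, list) else [doc]
            for item in items:
                if (isinstance(item, dict)
                        and item.get("@type") == "WebSite"
                        and "name" in item):
                    self.jsonld_names.append(item["name"])


def find_site_names(html_source):
    """Return the site-name signals found in the given HTML source."""
    parser = SiteNameParser()
    parser.feed(html_source)
    parser.close()
    return {"og_site_name": parser.og_site_name,
            "jsonld_names": parser.jsonld_names}


if __name__ == "__main__":
    # Made-up sample page; real page source would be fetched separately.
    sample = (
        '<html><head>'
        '<meta property="og:site_name" content="Wiktionnaire">'
        '<script type="application/ld+json">'
        '{"@context": "https://schema.org", "@type": "WebSite", '
        '"name": "Wiktionnaire", "url": "https://fr.wiktionary.org/"}'
        '</script></head><body></body></html>'
    )
    print(find_site_names(sample))
```

Running this against the source of both the root redirection target and an ordinary article would show which signals each page exposes, though only Search Console can confirm which one Google uses.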

Search-Console-access-request is not the right tag, but maybe the team that grants that access can help us reach someone who has it. Unfortunately, I am not trusted enough to request this access myself.

Bodhisattwa subscribed.

Adding Wikidata project tag too

Screenshot from 2024-04-08 11-27-28.png (149×642 px, 27 KB)

Just escalated to the folks talking to Google one more time.