Url "veaction=edit" should be marked as noindex, nofollow
Closed, Resolved | Public | BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

  • Search site:"zh.wikipedia.org" "veaction=edit" in google

What happens?:

Item: https://zh.wikipedia.org/zh-tw/EmEditor?veaction=edit
Item: https://zh.wikipedia.org/zh-tw/UltraEdit?veaction=edit
Item: https://zh.wikipedia.org/zh-tw/File:How_to_Edit_Wikidata.pdf?page=1&veaction=edit
Item: https://zh.wikipedia.org/zh/%E9%A6%99%E6%88%91%E7%BE%8E%E7%AB%99?veaction=edit
Item: https://zh.wikipedia.org/zh-hant/%E5%9C%A3%E6%B4%9B%E6%9C%97%E8%BF%AA%E6%99%AE%E6%9C%97_(%E5%90%89%E4%BC%A6%E7%89%B9%E7%9C%81)?veaction=edit&section=2
......

What should have happened instead?:
Those pages should be marked as noindex, nofollow.

Software version (skip for WMF-hosted wikis like Wikipedia):

Other information (browser name/version, screenshots, etc.):

Related Objects

Status | Subtype | Assigned | Task
Open | Feature | None |
Resolved | BUG REPORT | matmarex |

Event Timeline

SunAfterRain renamed this task from Url "veaction=edit" should mark as noindex, nofollow to Url "veaction=edit" should be marked as noindex, nofollow. Oct 1 2022, 7:18 AM
SunAfterRain updated the task description.

@ovasileva: can you foresee any unexpected consequences with the Editing Team doing what this task is asking for, to make it so search engines omit links to open the visual editor from appearing within search results?

Yeah, no concerns from my side so long as we're sure the noindex will only affect VE pages.

Change 863035 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/VisualEditor@master] Don't index VE edit pages

https://gerrit.wikimedia.org/r/863035

You'd think the <link rel="canonical"> would be sufficient...

Change 863035 merged by jenkins-bot:

[mediawiki/extensions/VisualEditor@master] Don't index VE edit pages

https://gerrit.wikimedia.org/r/863035

I am trying to understand this better. I would expect these pages not to be displayed in the search results.
https://photos.app.goo.gl/tsHZ7r4iXNLC5W6o7 is what I get. Is this expected?

If not, what would the best way to test this be?

Cc: @Esanders @matmarex

You're correct, but it might take a while for the search results to disappear – this is outside our control. Let's give it a few weeks.

I verified that when visiting e.g. "https://zh.wikipedia.org/zh/%E9%A6%99%E6%88%91%E7%BE%8E%E7%AB%99?veaction=edit", I now get <meta name="robots" content="noindex,nofollow,max-image-preview:standard"/> in the output, so hopefully Google will pick this up as well eventually.
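For anyone who wants to repeat this check without reading the page source by hand, here is a minimal sketch using only the Python standard library. It parses an HTML string and collects the directives from any <meta name="robots"> tag. The class and function names (RobotsMetaParser, robots_directives) are made up for illustration, and fetching the page is left out, so the HTML is assumed to be already in hand.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here,
        # because HTMLParser routes them through handle_starttag by default.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # The content attribute is a comma-separated list of directives.
        self.directives.update(
            part.strip().lower() for part in content.split(",") if part.strip()
        )


def robots_directives(html):
    """Return the set of robots directives found in the given HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# The tag quoted in the comment above, wrapped in a minimal page.
sample = (
    '<html><head><meta name="robots" '
    'content="noindex,nofollow,max-image-preview:standard"/></head></html>'
)
directives = robots_directives(sample)
print("noindex" in directives, "nofollow" in directives)
```

This only confirms what the server sends. Whether and when a search engine acts on it is a separate matter.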

I had a look again, and search results are still displaying many pages like this.

For example: https://www.google.com/search?q=site%3A"zh.wikipedia.org"+inurl%3A"veaction%3Dedit"

Some of the results have the "No information is available for this page. Learn why" message, linking to https://support.google.com/webmasters/answer/7489871. That page states that Google might be unable to update its results because it is blocked in robots.txt:

Use "noindex" on your page. If using noindex, you must also remove the robots.txt rule that blocks the page to search engines. Sounds strange, but Google needs to be able to read the page in order to see your "noindex" instruction. Learn about robots.txt here.

And indeed, https://zh.wikipedia.org/wiki/MediaWiki:Robots.txt blocked Google from crawling those pages, so it could never see the noindex tag. I have now removed that rule: https://zh.wikipedia.org/w/index.php?title=MediaWiki:Robots.txt&diff=75475776&oldid=73999017. Let's see if that fixes the issue.
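To illustrate the interaction described above, here is a rough sketch with Python's standard urllib.robotparser. It checks whether a crawler may fetch a URL at all, which it must be able to do before it can see a noindex tag. The robots.txt rule and the example.org URLs below are hypothetical and are not the rule that was removed from zh.wikipedia.org. Note also that the standard-library parser only does prefix matching and does not implement the wildcard patterns some crawlers accept.

```python
from urllib.robotparser import RobotFileParser


def crawler_can_fetch(robots_txt, url, user_agent="Googlebot"):
    """Return True if robots_txt lets user_agent fetch url (prefix matching only)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rule, for illustration only.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: *
Disallow: /w/index.php
"""

blocked_url = "https://example.org/w/index.php?title=Foo&veaction=edit"
open_url = "https://example.org/wiki/Foo"

for url in (blocked_url, open_url):
    if crawler_can_fetch(HYPOTHETICAL_ROBOTS_TXT, url):
        print(url, "-> crawlable, so a noindex tag on the page can be read")
    else:
        print(url, "-> blocked, so a noindex tag on the page is never seen")
```

A URL that is both disallowed in robots.txt and marked noindex can therefore stay in the results, which matches the "No information is available for this page" entries mentioned above.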

It looks like the pages are being reindexed… very slowly. The same search now gives me "852,000" results (it was "1,270,000" 3 weeks ago).

It says "964,000 results" now, but if I try to go to the last page, there are actually only 81 results in total.

It seems like the results are still slowly getting reindexed. Let's give it some more time before we close.

We're down to almost zero results (fewer than one page). Hopefully this means that they will almost never show up in normal searches. I'm not sure why a few of them are stuck, but I don't think it matters. Let's call this resolved.

matmarex claimed this task.
matmarex added a subscriber: VPuffetMichel.