
Stop Google from returning old obsolete wikimediafoundation.org job postings in search results
Closed, Resolved · Public

Description

Google "Wikimedia Chief technology officer job opening".

One of the top results is a job opening on wikimediafoundation.org from 2009! These 150 old job openings rank higher than our current job openings on Greenhouse, and anyone who follows a link to them may think the job opening is closed and give up.

Maria O'Neill confirmed they're all obsolete, so on guillom's suggestion I added __NOINDEX__ to Template:Job opening status. But it has no effect: the pages aren't in Category:Noindexed_pages and their source doesn't have <meta name="robots" content="noindex,follow" />. I believe this is because wmf-config has

$wgExemptFromUserRobotsControl = array_merge( $wgContentNamespaces, $wmgExemptFromUserRobotsControlExtra );

so NS_MAIN is exempt from NOINDEX doing anything.
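
For context, here is a simplified sketch of how that exemption interacts with __NOINDEX__ (not the actual MediaWiki core code, just the gist of the robot-policy check, with $title standing in for the page being rendered):

// Sketch only: the 'noindex' page property set by __NOINDEX__ is only
// honoured when the page's namespace is NOT in $wgExemptFromUserRobotsControl.
if ( !in_array( $title->getNamespace(), $wgExemptFromUserRobotsControl ) ) {
    // the user-controlled policy applies, e.g. emitting
    // <meta name="robots" content="noindex,follow" />
}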

It seems the fix could be one of the following:

  • Ask a wikimediafoundation admin to delete all 150 old pages.
  • Have someone move all 150 to a namespace where NOINDEX has an effect, e.g. move each to a subpage of its talk page.
  • Fiddle with wmf-config so that NOINDEX has an effect in the main namespace of wikimediafoundation.org (a sketch follows this list).
  • Put some other HTML in the template that discourages Google. (Google already does not show the template's "This (old) job opening has long been closed" in its snippet.)
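
For the wmf-config option, a minimal sketch of what the change could look like (my assumption about placement, not an actual production diff) would be to special-case foundationwiki around the array_merge line quoted above:

if ( $wgDBname === 'foundationwiki' ) {
    // let __NOINDEX__ take effect everywhere on this wiki, including NS_MAIN
    $wgExemptFromUserRobotsControl = $wmgExemptFromUserRobotsControlExtra;
} else {
    $wgExemptFromUserRobotsControl = array_merge( $wgContentNamespaces, $wmgExemptFromUserRobotsControlExtra );
}

The trade-off is that NOINDEX would then work in the main namespace of the whole wiki, not just on the old job postings.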

I don't know if there's an administrators' noticeboard on wikimediafoundation. I'll link to this task from https://meta.wikimedia.org/wiki/Foundation_wiki_feedback

Event Timeline

Spage raised the priority of this task from to High.
Spage updated the task description. (Show Details)
Spage added subscribers: Spage, gpaumier, Dzahn.

Or ...

  • Update the wiki pages with the 2015 info and make them relevant again. They have a much higher Google score, so why should they show outdated info, and why would we have to delete anything? We already link from there to Greenhouse; how about just copy/pasting the current info and keeping that link too? (I said this after seeing the CTO search result from 2009 and a link to the same thing from 2015 above.)

> I don't know if there's an administrators' noticeboard on wikimediafoundation

lol

Just to confirm your suspicion re. $wgExemptFromUserRobotsControl, though:

krenair@tin:~$ mwscript eval.php foundationwiki
> var_dump( in_array( NS_MAIN, $wgExemptFromUserRobotsControl ) );
bool(true)

That will indeed prevent main namespace pages from using NOINDEX.

We can easily create an Archive: namespace for such pages where NOINDEX will be active (in addition to the signal sent by an "Archive" prefix) and mass-move the pages there with a bot.
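
If we go that route, the namespace itself is a one-off config addition. A sketch, assuming namespace IDs 100/101 are free on this wiki (the exact IDs and placement would be decided in wmf-config review):

// Define the custom namespace pair (IDs are an assumption).
$wgExtraNamespaces[100] = 'Archive';
$wgExtraNamespaces[101] = 'Archive_talk';
// Archive: is not a content namespace, so __NOINDEX__ already works there;
// optionally force a blanket policy for the whole namespace:
$wgNamespaceRobotPolicies[100] = 'noindex,follow';

The 150 moves themselves could then be scripted with any bot framework.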

Or simply remove NS_MAIN from $wgExemptFromUserRobotsControl on wmfwiki.

Deskana lowered the priority of this task from High to Lowest. Dec 5 2015, 7:06 AM
Deskana moved this task from Needs triage to Search on the Discovery-ARCHIVED board.
Deskana subscribed.

Updating project tags due to T200951. Probably needs rechecking.

Is this still an issue? We have submitted info on the new site to Google, which in theory should have helped correct this.

> Is this still an issue? We have submitted info on the new site to Google, which in theory should have helped correct this.

Yes, the steps to reproduce in the task description still reproduce the problem.

We have made a number of updates to Governance Wiki, and as a result I believe this issue is now resolved (please reopen if you find results indicating it is not fully resolved). The search from the description now (more or less) returns the links we would want and does not display older content from foundation.wikimedia.org.