Fri, Sep 29
Folks, here's a plan of action.
Fri, Sep 15
Wed, Sep 6
Seddon and I are meeting on Friday. We'll have a concrete action plan (or
the beginnings of one) to share on Monday.
Aug 24 2023
We're meeting with them in the next couple of weeks to troubleshoot our scraping problems. Will report back once we learn more.
Aug 16 2023
Manager here: I approve.
Aug 9 2023
This has been reported to Google. We're waiting for them to get back.
Aug 8 2023
@Jdlrobson Did the frontend standards group meeting come to a conclusion, or did it bring up any new insights not already shared in this ticket?
Aug 2 2023
Jul 10 2023
Thanks for all the inputs.
Jul 7 2023
@elukey Do we have another idea on the table aside from asking a team of Android devs (3) to maintain a recommendations service?
This has been granted for a period of one year.
Jun 26 2023
@Soda I need a gmail account to provide access to.
Jun 21 2023
OK, having talked to some folks in the Enterprise org and other teams and having eliminated a few possible problems, the one I'm investigating now is the possibility that for some reason Google's bot is rate-limiting itself. I'll continue to post any findings here.
Jun 5 2023
Thank you for the supporting links.
Jun 2 2023
Here are some unindexed articles (confirmed from search console and from Google). I came upon them by simply hitting "Zufällige Seite" (Random Page) on de Wikisource and checking if the resulting page is indexed at all.
@Xover That's very useful to know.
@Soda Yeah Navboxes would indeed have helped.
@Krinkle Could you please answer the question I had above when you have a chance?
Jun 1 2023
As far as I can tell, a lot of the pages that aren't appearing in the index are simply not linked to from within the Wiki. There are no sitemaps any more, and Special:Lonelypages is uncrawlable because there's a robots.txt rule blocking all Special: pages from being crawled.
May 26 2023
May 24 2023
I think we can close this task for now.
May 23 2023
If the volunteer is given access to Search Console data, they will be able
to examine the following information:
- which queries lead to search results on Google that have Wikis in them
- which results get clicked on by users
- the total query volume that has search results from the Wikis
- whether a given page from a Wiki property has been indexed or has
problems preventing it from being indexed
- a breakdown of the above information by device platform, country, and
May 18 2023
Manager approval if needed: approved
May 17 2023
I've responded on the ticket with the volunteer. I'll handle it once they get the NDA and C-level approval out of the way.
May 16 2023
Please assign this to me once C-level approval and NDA have been taken care of.
May 15 2023
No strong opinions either way, but it seems to me that the examples that @pmiazga provides aren't all user types. "admin", "group:something", or "en_wiki:admin" are not really user types. They're memberships or privileges, which can grow and change as needed. Isn't there another way to represent those? I would imagine that a user type is something that's largely permanent, while privileges (through memberships) are expected to change.
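To make the distinction concrete, here is a minimal sketch of what I mean. All names below (UserType, its values, User, grant, revoke) are hypothetical and exist only for illustration; nothing here is taken from the actual proposal. The idea is that the user type is set once, while memberships are an open-ended set that can grow and shrink.

```python
from dataclasses import dataclass, field
from enum import Enum


class UserType(Enum):
    # Hypothetical values, chosen only to illustrate a small, stable set.
    ANONYMOUS = "anonymous"
    REGISTERED = "registered"


@dataclass
class User:
    name: str
    # Largely permanent: describes what kind of account this is.
    user_type: UserType
    # Expected to change: free-form memberships such as "en_wiki:admin".
    memberships: set[str] = field(default_factory=set)

    def grant(self, membership: str) -> None:
        """Add a membership without touching the user type."""
        self.memberships.add(membership)

    def revoke(self, membership: str) -> None:
        """Remove a membership; does nothing if it was never granted."""
        self.memberships.discard(membership)
```

With this shape, granting or revoking "en_wiki:admin" never changes what type of user someone is, which is the separation I'm suggesting.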
May 11 2023
Silly question: @Krinkle Given that robots.txt has a Disallow: /wiki/Special: rule, how do search engines read the LonelyPages or RecentChanges pages? As far as I can tell these pages aren't being indexed. Am I missing something?
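For anyone who wants to reproduce what I'm seeing, here is a rough sketch using Python's standard urllib.robotparser. It assumes a cut-down robots.txt containing only the rule quoted above (the real file has many more entries), and the is_crawlable helper and the "Googlebot" default are my own illustrative choices.

```python
from urllib.robotparser import RobotFileParser

# Assumed minimal robots.txt with only the rule quoted in this comment;
# the production file is much longer.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/Special:
"""


def is_crawlable(path: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given path may be fetched under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/wiki/Special:LonelyPages", "/wiki/Main_Page"):
        print(path, is_crawlable(path))
```

Under that assumed rule, a compliant crawler would skip every Special: page, LonelyPages and RecentChanges included, while ordinary content pages remain fetchable.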
Apologies for the ridiculous delay. I have Wikisource search console access now and am looking at it.
Apr 26 2023
Mar 20 2023
WMF Staff who will need this on an ongoing basis. No expiry.
Mar 10 2023
Perhaps @Dbrant knows? I remember Jdlrobson mentioning that they did some work together on this.
Feb 15 2023
Rather than speculate on this one feature in isolation, I recommend letting @ovasileva decide and prioritize based on what else needs doing. She is likely to have a more holistic view and more insights from the community.
Feb 3 2023
Just had a conversation with @Catrope . I'm comfortable owning the risk and understand what that entails.
Jan 20 2023
That is not the case. Quoting from T326816 (please read for details):
Jan 17 2023
Please note that any changes to this extension might first need to be
discussed in T326956.
Jan 13 2023
Dec 21 2022
I've mostly focused on search performance and stats for the Wikipedias and haven't had a chance to set up and build an understanding of where we are with Wikisource yet. That's why I don't have an immediate answer for this problem.
Dec 16 2022
The vendor replied with numbers over the last two days. Here are some numbers that might help decide on a strategy. There is a new dump of IPs every day. Between the last two successive dumps (today and yesterday) there is:
- 10-12% daily change
- 1.7M New entries
- 2M Aged off entries (i.e. dropped/removed)
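One strategy these numbers might point toward is applying only the daily difference instead of reloading everything. The sketch below is an assumption-laden illustration, not a proposal for the actual pipeline: I don't know the vendor's dump format, so it treats each dump as a plain collection of IP strings, and diff_dumps is a hypothetical helper.

```python
from typing import Iterable


def diff_dumps(
    previous: Iterable[str], current: Iterable[str]
) -> tuple[set[str], set[str]]:
    """Compare two dumps and return (new_entries, aged_off_entries).

    Assumes each dump fits in memory as a set of IP strings.
    """
    prev_set = set(previous)
    curr_set = set(current)
    new_entries = curr_set - prev_set       # present today, absent yesterday
    aged_off_entries = prev_set - curr_set  # present yesterday, absent today
    return new_entries, aged_off_entries


if __name__ == "__main__":
    # Documentation-range addresses used purely as sample data.
    yesterday = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
    today = ["192.0.2.2", "192.0.2.3", "198.51.100.7"]
    added, removed = diff_dumps(yesterday, today)
    print("new:", sorted(added))
    print("aged off:", sorted(removed))
```

Whether inserting and deleting a few million rows per day is cheaper than a full wipe-and-reload is something the DBAs would need to weigh in on.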
Dec 15 2022
Thanks for replying, @Ladsgroup
Are DBA and SRE folks aware that this entire database will essentially be wiped every one or two days and recreated from a dump? Does that complicate things in any way?
Dec 14 2022
Apologies, I don't have access to Wikisource. @mpopov probably does.
Dec 6 2022
Nov 8 2022
Nov 2 2022
Oct 24 2022
Trying to summarize the discussion above; please point out anything that is incorrect:
Aug 29 2022
Also granted access to ptwiki and hiwiki (desktop + mobile) as requested by @ovasileva
Aug 26 2022
English has also been added. Thanks for the action, @jcrespo. Noted, will notify the on-duty next time. Modified the wiki accordingly.
I've granted access. @jcrespo could you please add the expiration date to maint-announce and close the ticket?
Jun 1 2022
Access has now been revoked.
May 23 2022
Manager is OOO.
Skip-level Manager here, approved (if needed).
May 19 2022
May 13 2022
@phuedx Was the exclusion of is_bot intentional?
Apr 29 2022
Just checked, these are safe to delete.
Apr 26 2022
Option 1: "Inline after the lead section" has some advantages.
- It resembles the previous state and allows the menu to go back to its previous narrower form (now that it doesn't have to match widths with the ToC, which needs to be wider). This will probably leave people who like narrow windows largely unaffected.
- Yes, in that case, there's no floating ToC any more, but then if the window is that narrow, there's no room for one anyway.
Apr 20 2022
Apr 11 2022
I've given the above-mentioned e-mail address access to the two English
Wikipedia domains (en.wikipedia.org and en.m.wikipedia.org).
@jcrespo I've been involved in this discussion so I know what's going on here. I've updated the ticket to reflect what they need. I can take care of providing access; how do I put an entry in the maint-announce calendar to revoke access on the expiry date?
Mar 28 2022
That's _very_ interesting. Thanks for doing this! Would it make sense to just test this theory a bit more?
Mar 21 2022
Mar 15 2022
Mar 10 2022
Works as expected, thank you.
Update: @AndyRussG already has access to Bing, as of yesterday. I'm working on a process to follow if/when more people request access.
Mar 9 2022
Apologies, this fell through the cracks somewhat. I'll reply conclusively
before the end of this week.
Mar 8 2022
@JMeybohm Given that we didn't actually add each language domain one by one, there should be a "wikipedia.org" entry listed as a "Domain Property" along with all of the per-language Wikipedias on Search Console. When you select that one and add me as an "Owner", that should do it. There isn't an actual "Administrator" ACL level as far as I can tell - it's just "Owner", "Restricted", or "Full".
Mar 7 2022
It got a bit trickier.
I confirm that Bing.com verification has worked properly. However, for Yandex it seems they need the TXT entry to be under www.wikipedia.org and not wikipedia.org. Sent out patch https://gerrit.wikimedia.org/r/c/operations/dns/+/768664 to that effect.
Mar 4 2022
Just had a discussion with @jcrespo. To understand what each of these webmaster consoles provides and what ACLs and such they support (so that we can have a process around giving access to it), we want to first add a few domains to each and learn more about the platforms. Stand by for a DNS patch.
Feb 25 2022
You're absolutely right to be concerned about traffic from search engines. That said, I'm familiar enough with how this works to be comfortable owning it, and my PM counterpart and I (and a half dozen or so other people at the Foundation including @AndyRussG) gaze at this data often enough that we'd know if something were amiss, and we'll obviously be watchful if we decide to make any changes at all.
Just filed https://phabricator.wikimedia.org/T302617 to start discussing domain ownership verification on various consoles with SRE. I'll be unavailable for a few days starting now, but I've already had a discussion with @AndyRussG about this.
In case someone's wondering, DuckDuckGo doesn't actually have a webmaster console. Strange.
Feb 24 2022
I'm working on this, please stay tuned on this ticket; ETA tomorrow.
Feb 21 2022
Feb 17 2022
Just did. All good, thank you and sorry for the trouble!
Feb 14 2022
My new public key is here: