wikistats needs improved data and presentation for fandom
Open, Medium, Public

Description

Neither the wikia overview page https://wikistats.wmflabs.org/display.php?t=wi nor the tables in any format give any result.

Overview page fails with HTTP ERROR 500.

Event Timeline

Please note that most sites have been moved from wikia.com to fandom.com, see T215038

Yeah, it's been a known issue for many years. The number of wikia wikis was too large, and I didn't have a good way to import a list of all wikia wikis and keep it updated. Also, fetching stats from each one would take a really long time. I was once told there was a contact at Wikia to talk to about that, but that never materialized.

I didn't know about the fandom part. Any idea roughly how many wikis exist in each domain? Are there reliable ways to get a list of all of them?

Dzahn triaged this task as Low priority. Feb 7 2019, 5:22 PM

wowwiki would be enough for me because it is the only wikia subdomain family that is included in the pywikibot families folder ;-)
(the others have to be generated by a script):
https://gerrit.wikimedia.org/r/plugins/gitiles/pywikibot/core/+/f741a2575567af7ca5e2f992430772d0b111e9fc/pywikibot/families/

There are 31 subdomains of wowwiki
https://gerrit.wikimedia.org/r/plugins/gitiles/pywikibot/core/+/cfa2046e4204790448cc85145775e57eeebc94ee/pywikibot/families/wowwiki_family.py

Aha! This makes it much more doable. Yes, I am open to making a special table with just these.

I did a quick count by HTTP status code and

+------+--------+
| http | count  |
+------+--------+
|  404 | 339979 |
|  301 |  70664 |
|  410 |     38 |
|  403 |      9 |
|  302 |      7 |
|  992 |      6 |
+------+--------+

so out of about 411k wikis, roughly 340k (the 404 rows) don't even redirect to a fandom domain. Let's drop them and then hopefully clean up the rest. I backed it up.

Mentioned in SAL (#wikimedia-cloud) [2023-09-12T09:41:11Z] <RhinosF1> drop wikia where http = 404 T215534
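For anyone following along, the count and the drop above can be sketched roughly like this. It is only an illustration: it uses an in-memory SQLite database as a stand-in for the real wikistats database, the `prefix` column and the sample rows are made up, and only the `wikia` table name and `http` column are taken from the log entry.

```python
import sqlite3


def status_counts(conn: sqlite3.Connection) -> list:
    """Return (http, n) pairs, most common status first."""
    return conn.execute(
        "SELECT http, COUNT(*) AS n FROM wikia "
        "GROUP BY http ORDER BY n DESC, http ASC"
    ).fetchall()


def drop_by_status(conn: sqlite3.Connection, status: int = 404) -> int:
    """Delete rows whose stored HTTP status equals `status`; return rows removed."""
    cur = conn.execute("DELETE FROM wikia WHERE http = ?", (status,))
    conn.commit()
    return cur.rowcount


# Hypothetical sample data; `prefix` is an assumed column name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wikia (prefix TEXT PRIMARY KEY, http INTEGER)")
conn.executemany(
    "INSERT INTO wikia VALUES (?, ?)",
    [("a", 404), ("b", 404), ("c", 301), ("d", 404), ("e", 410)],
)

before = status_counts(conn)   # tally per status code before the cleanup
removed = drop_by_status(conn) # drop the 404 rows
after = status_counts(conn)    # tally again to confirm they are gone
```

Taking a backup before running the delete, as was done here, keeps the step reversible.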

wow, that's an insane percentage of broken ones. thank you for dropping them! I hope this also fixes the "slow to load table" issue and we can make updates work again.. and properly rename it to fandom.

So I worked out that the page is causing the browser to OOM. I moved it to generating on the machine instead, and then Apache OOMs.

It's 1.1 million lines of HTML when generated. bleh

@Xqt: do you have a way to get an up to date list of fandom wikis?

We simply can't display them all, but if we had some, then we could show something.

So I replaced wikia.com with fandom.com, and updates are running. I haven't merged the commit on GitLab yet.
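As a rough sketch of what such a domain swap could look like (not the actual commit, which is not shown in this task): the helper name `to_fandom` and the example URL are illustrative assumptions, and the function only rewrites the host suffix. It does not handle any other URL changes that came with the move, and it drops any userinfo in the URL.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_SUFFIX = ".wikia.com"
NEW_SUFFIX = ".fandom.com"


def to_fandom(url: str) -> str:
    """Rewrite a *.wikia.com URL to the same host under fandom.com."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.endswith(OLD_SUFFIX):
        # Not a wikia.com host: leave the URL untouched.
        return url
    new_host = host[: -len(OLD_SUFFIX)] + NEW_SUFFIX
    # Preserve an explicit port if one was given.
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


# Illustrative URL only.
example = to_fandom("http://wowwiki.wikia.com/api.php")
```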

I will delete all broken wikis again once we update to using fandom.com, then we'll have to look at how to limit results returned to top X wikis for now.

RhinosF1 raised the priority of this task from Low to Medium. Nov 3 2023, 8:03 PM

Change 971526 had a related patch set uploaded (by RhinosF1; author: RhinosF1):

[operations/puppet@production] wikistats:wikia: pause updates while changes are made to table

https://gerrit.wikimedia.org/r/971526

Change 971526 merged by Dzahn:

[operations/puppet@production] wikistats:wikia: pause updates while changes are made to table

https://gerrit.wikimedia.org/r/971526

It is now capped at 10k wikis.
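A minimal sketch of that kind of cap, for illustration only. The task does not say which column the wikis are ranked by, so the `good` field below is a hypothetical ranking value, and `WikiRow` and `top_wikis` are made-up names rather than anything from the wikistats code.

```python
import heapq
from typing import Iterable, NamedTuple


class WikiRow(NamedTuple):
    name: str
    good: int  # hypothetical ranking column (assumption, not from the task)


def top_wikis(rows: Iterable[WikiRow], limit: int = 10_000) -> list:
    """Return at most `limit` rows, largest `good` value first."""
    # nlargest keeps only `limit` rows in memory while scanning the input,
    # so the full table never has to be sorted or rendered.
    return heapq.nlargest(limit, rows, key=lambda r: r.good)


rows = [WikiRow("a", 5), WikiRow("b", 50), WikiRow("c", 1)]
capped = top_wikis(rows, limit=2)
```

In a database-backed page the same effect would more likely come from an `ORDER BY ... LIMIT` clause in the query, so that the rows beyond the cap are never fetched at all.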

We need to look at the best way forward for the future, including importing new wikis.

RhinosF1 renamed this task from wikistats does not work for wikia sites to wikistats needs improved data and presentation for fandom. Nov 7 2023, 5:23 PM