Tue, Jan 16
Thu, Jan 11
Sat, Jan 6
The description is a bit unclear: Shall it be enabled for hewiki only initially (and if so for how long) or hewiki + cawiki, trwiki, …?
Fri, Jan 5
I think this is all fine actually :)
Thu, Jan 4
Ran again, see P6522#36749.
hoo@terbium:~$ mwscript namespaceDupes.php --wiki eswiki --fix
0 pages to fix, 0 were resolvable.
Mon, Jan 1
Wed, Dec 27
Dec 17 2017
Dec 8 2017
Dump crons are back (af1f7dabee931dbdb7366b7be1f93698f2c56108), so I expect dumps to arrive in time next week!
Dec 6 2017
This was due to f77b47c5e30c297a43e4ce354c1dfb4a29a638fa being deployed too early; it should not have been in wmf11.
Dec 4 2017
Downgrading this, as this seems to be working again.
So I merged the patch that makes sure we only actually collect this data if not in CLI mode.
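In rough terms, such a guard can look like the following (a minimal sketch, not the actual patch; collectProfilingData() is a made-up stand-in for the real collection code):

<?php
// Minimal sketch, not the actual patch: only collect the data for web requests,
// never when running from the command line.
function collectProfilingData(): void {
	// ... gather and report the data here ...
}

if ( PHP_SAPI !== 'cli' ) {
	collectProfilingData();
}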
Dec 2 2017
That's definitely the case here, given that the sample rate is only applied later on.
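To illustrate the problem (hypothetical names, not the actual code): the measurement is taken unconditionally, and the sample rate is only consulted when reporting, so the collection cost is paid either way, including in CLI mode.

<?php
// Illustration only (hypothetical, not the actual code): the data is gathered
// no matter what; the sample rate only decides whether it gets reported.
$sampleRate = 0.01;
$stats = [ 'peakMemory' => memory_get_peak_usage() ]; // collected unconditionally
if ( mt_rand() / mt_getrandmax() < $sampleRate ) {
	// Sampling only kicks in here, long after the data was gathered.
	error_log( json_encode( $stats ) );
}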
I don't see a clear way to fix, thus reassigning to Aaron.
So I finally managed to track this down:
Dec 1 2017
Regarding the app servers, the canary ones are indeed (mostly) enough. But for the Varnishes, having access to the actual ones would be very nice… I don't see much value in the canary caches here.
Reverting 795350da2e5c49efa66c1950bd034f46aeb3768a also doesn't seem to make any difference.
When changing CacheRetrievingEntityRevisionLookup to always use its underlying EntityRevisionLookup ($this->lookup) on mwdebug1001, both the call with --no-cache and the one without it show very similar memory usage behavior.
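For reference, that debug change amounts to roughly the following (simplified sketch; the real method signature and class details are omitted):

// Simplified sketch, not the actual Wikibase code: bypass the cache entirely
// and always delegate to the underlying EntityRevisionLookup.
public function getEntityRevision( ...$args ) {
	return $this->lookup->getEntityRevision( ...$args );
}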
Nov 28 2017
So I just ran one JSON dumper with --no-cache and one without on mwdebug1001 (with HHVM)… the results are baffling:
Nov 27 2017
I just tested this briefly locally by dumping my wiki several times (up to 100 times), but I couldn't see any kind of memory leak (then again, I was dumping the same entities over and over).
Not a single entity managed to get dumped since I created this ticket more than 7h ago, so I've now killed all related processes.
Given that we will have a new dump this week anyway, I won't bother re-starting it manually.
Nov 21 2017
Nov 20 2017
Both scripts look fine again and the dumpers are running… sorry for the mess :/
Nov 16 2017
This at least breaks the client link item widget, and probably user scripts as well.
The new version works on HHVM as well as on Zend. We can switch this back to HHVM now… thanks for tackling this.
Nov 14 2017
Personally I would like to avoid running them over the weekend… but that's not a showstopper here, I just want to keep the number of moving parts down during the weekend.
From next week on, this should be OK again.
Nov 13 2017
Nov 10 2017
Duplicate of T177486: [Tracking] Wikidata entity dumpers need to cope with the immense Wikidata growth recently? I'm aware of these huge fluctuations in run time, but haven't been able to look into this in detail yet.
Nov 5 2017
Note: I just also found T179793: Consider dropping the "wb_items_per_site.wb_ips_site_page" index while looking at this… maybe this can be done at once?!
Given the size of the table, changing this shouldn't be overly horrible. It's a fair bit of migration work… but I assume doing this for maintenance queries and consistency is worth it.
The second query can be expressed as:
Nov 2 2017
Nov 1 2017
Run time for the full TTL dump (the data diff is from the gzipped file):
Oct 31 2017
(Probably) due to the DataModel updates, the current JSON dump was created in just 25 hours, compared to ~34-35h last week. (This is data from one run only, so not overly reliable… but the difference is huge.)
Well, we could allow this, I guess… but we should at least set a canonical URL (or one per output?) as a header (we can't put it in the HTML here, as there's none).
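A minimal sketch of that, assuming we only control the PHP response, would be the standard Link header with rel="canonical" (the concrete URL below is just an example):

<?php
// Sketch only: send the canonical URL as an HTTP header instead of an HTML tag.
// The entity URL is a made-up example.
header( 'Link: <https://www.wikidata.org/wiki/Special:EntityData/Q42>; rel="canonical"' );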
Oct 30 2017
I looked at this again earlier today and there are actually a lot of requests to APs on bnwiki… all coming from Facebook IP ranges (IPv6), but with different UAs and organic-looking patterns.
This is currently running into timeouts from time to time:
This has been enabled (again) and this time it's here to stay! \o/
While T178247: Use a retrieve only CachingEntityRevisionLookup for dumps will certainly make the dumps much faster, it will only do so (noticeably) on HHVM. This is because we split the cache between HHVM and Zend (see below), so the (currently Zend) dumpers won't profit from a cache that is probably mostly populated on the HHVM side (as all app servers run HHVM).
There are some other maintenance scripts using Zend which might also write into this cache… so maybe this will still help somewhat, though (see the sketch below).
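As a rough illustration of the split (hypothetical key construction, not the actual Wikibase code), the PHP runtime ends up as part of the cache key, so HHVM and Zend processes read and write disjoint entries:

<?php
// Hypothetical sketch: the runtime is baked into the cache key, so entries
// written under HHVM are invisible to Zend and vice versa.
$runtime = defined( 'HHVM_VERSION' ) ? 'hhvm' : 'zend';
$cacheKey = "wikibase:entityrevision:$runtime:Q42";
echo $cacheKey . "\n";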
Oct 29 2017
The above change will allow us to increase our maximum throughput by (almost) a factor of 8, so it should fix these issues, at least for now.
There are actually a lot of changes related to zhwiki happening:
This just happened again, zhwiki got backlogged by 8h+.