For at least two recent Wikipedias (kcgwiki and blkwiki), and probably others, the statistics in Special:Statistics and in the API remain at zero pages, articles, users, files, etc. right after the creation/import from Incubator, except for active users and bots. Apparently some script or process needs to be run to initialize/update the data; it would be useful if https://wikistats.wmcloud.org/ could show accurate figures right away, or at least from the next midnight update, if we think about the potential visitors looking for "new stuff". Thanks.
Ah, yes.. so this should be:
[mwmaint1002:~] $ sudo systemctl start mediawiki_job_initsitestats.service
which I have manually run before (or on request), but sometimes we just wait for the next automatic timer run.
The real fix would be for the wiki-creation script to also run the update command:
/usr/local/bin/mw-cli-wrapper /usr/local/bin/foreachwiki initSiteStats.php --update
or the version of that _just for the new wiki_. Running it for all wikis was also very fast, though.
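For reference, the single-wiki variant mentioned above would presumably look something like this (a sketch, not a verified command: `mwscript` is assumed here as the per-wiki counterpart of `foreachwiki`, and `kcgwiki` is just one of the affected dbnames from this task, used as an example):

```shell
# Recount site stats for a single wiki instead of iterating over all of them.
# "kcgwiki" is an example dbname; substitute the newly created wiki.
mwscript initSiteStats.php --wiki=kcgwiki --update
```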
I'm not sure whether that would actually do the trick. The wiki-creation script runs while the wiki is still empty; in other words, once addWiki.php completes, all zeros in Special:Statistics are the expected behavior. Things break a bit later, when the new-wiki importers populate the wiki with some content (which happens a few days after addWiki.php completes).
The importers use the regular import endpoints (the action=import API and Special:Import) to populate the wiki with content. The WikiImporter class appears to include some code to update site stats, but apparently it doesn't work. Considering that addWiki.php runs INSERT INTO site_stats (ss_row_id) VALUES (1) (which leaves most columns of the table NULL), it might be that WikiImporter only updates site stats correctly when the value fields in site_stats are not NULL. Might be worth testing with the next batch of wikis.
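If the NULL theory above is right, the mechanism would be SQL arithmetic semantics: a delta update such as `ss_total_pages = ss_total_pages + 1` applied to a NULL column yields NULL, so incremental updates can never move the counters off NULL. Here is a minimal sketch of that behavior (plain Python, with `None` standing in for SQL NULL; this is an illustration of the hypothesis, not MediaWiki code):

```python
def sql_add(column_value, delta):
    """Mimic SQL arithmetic: NULL + anything = NULL."""
    if column_value is None:
        return None
    return column_value + delta

# Row as created by addWiki.php: INSERT INTO site_stats (ss_row_id) VALUES (1)
row = {"ss_row_id": 1, "ss_total_pages": None, "ss_good_articles": None}

# An import-style delta update: the counter stays NULL.
row["ss_total_pages"] = sql_add(row["ss_total_pages"], 5)
print(row["ss_total_pages"])  # None

# initSiteStats.php --update instead recomputes absolute values
# (e.g. a COUNT(*) over the page table), replacing the NULL outright.
row["ss_total_pages"] = 5  # absolute recount
print(row["ss_total_pages"])  # 5
```

This would also explain why a one-off recount fixes the wiki permanently: once the columns hold real numbers, subsequent delta updates work normally.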
Thanks! Just another suggestion: I understand from the code that the job runs at 05:39 UTC; could it be moved to around 23:45 UTC (or sooner, if it takes more than 15 minutes to complete) so that the Wikistats midnight update gets the data as fresh as possible, especially for new projects created that day? (Well, another option would be to move the Wikistats update, if @Dzahn agrees...)
@-jem- I uploaded the change above, but then realized that I had already spread out the other timers that you want to sync with, by project family. So it's currently like this on the other side:
'wp' : ensure => $ensure, hour => 0;  # Wikipedias
..
'wt' : ensure => $ensure, hour => 2;  # Wiktionaries
'ws' : ensure => $ensure, hour => 3;  # Wikisources
'wn' : ensure => $ensure, hour => 4;  # Wikinews
'wb' : ensure => $ensure, hour => 5;  # Wikibooks
'wq' : ensure => $ensure, hour => 6;  # Wikiquotes
..
'wy' : ensure => $ensure, hour => 10; # Wikivoyage
..
'wx' : ensure => $ensure, hour => 18; # Wikimedia Special
'mh' : ensure => $ensure, hour => 18; #
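For bot operators timing their reads against this staggered schedule, the snippet boils down to a per-family lookup of update hours. A small sketch (hours copied from the Puppet snippet above; families elided with ".." are omitted, and the one-hour completion margin is an assumption, not a measured runtime):

```python
# Wikistats per-family table update hours (UTC), per the Puppet snippet.
UPDATE_HOUR_UTC = {
    "wp": 0,   # Wikipedias
    "wt": 2,   # Wiktionaries
    "ws": 3,   # Wikisources
    "wn": 4,   # Wikinews
    "wb": 5,   # Wikibooks
    "wq": 6,   # Wikiquotes
    "wy": 10,  # Wikivoyage
    "wx": 18,  # Wikimedia Special
    "mh": 18,  # (label elided in source)
}

def freshest_read_hour(family: str) -> int:
    """Earliest UTC hour at which a bot can expect updated figures,
    allowing one hour for the update itself to finish (assumption)."""
    return (UPDATE_HOUR_UTC[family] + 1) % 24

print(freshest_read_hour("wp"))  # 1
```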
I'm assuming the previous time (05:39 UTC) was chosen to be during a relatively low-traffic/low-load time of day. Should that be considered in rescheduling this?
EDIT Wait a minute... 05:39 UTC is in the late/middle evening in the U.S., so maybe not!
Personally, I don't know why 05:39 was chosen; I assumed it was just about spreading all jobs randomly across the day. Digging for the original commit _might_ reveal something, but I expect it would require quite some digging.
@Dzahn, thanks. I assume your changes are the best solution that doesn't involve deeper changes of greater magnitude than the problem being solved. Bot operators like me can just check the update times in Wikistats for each project family and adapt their run times, and Wikistats operators (currently that would be you) can try to add new projects shortly before 21 h UTC (though it wouldn't be a big deal if not). Just one detail: you moved all Wikimedia families close to 0 h UTC, except for the Wikiversities...