+1 for removing this.
Tue, Feb 25
Mon, Feb 24
Fri, Feb 21
Tue, Feb 18
@dpifke Do you want to take this on since you're working in this area?
Thu, Feb 13
Compression seems doable. LZMA works well per https://phabricator.wikimedia.org/T235455#5837382 . arclamp-grep would have to change though; maybe grep(fname, search_string) could stream zipped log object contents to lzcat and loop through the resulting lines.
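A minimal sketch of the streaming idea, assuming Python's standard `lzma` module stands in for piping to lzcat. The `grep(fname, search_string)` signature follows the comment above; reading from a local file and matching by plain substring are assumptions made for illustration, and the real tool would fetch the log object contents from wherever they are stored.

```python
import lzma


def grep(fname, search_string):
    """Yield lines of an LZMA-compressed log that contain search_string.

    Assumption: fname is a local .xz/.lzma file. Decompression is streamed
    line by line, so the whole log is never held in memory at once.
    """
    with lzma.open(fname, mode="rt", encoding="utf-8", errors="replace") as f:
        for line in f:
            if search_string in line:
                yield line.rstrip("\n")
```

Because this is a generator, callers can stop early (for example after the first N matches) without decompressing the rest of the file.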
Wed, Feb 12
This vaguely reminds me of https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/238370/ . Though implementing KeyValueStore makes more sense indeed.
Tue, Feb 11
I can reproduce this with master phan as well:
Looks like hieradata/(swift|codfw)/params.yaml needs updating, along with the private puppet repo (beforehand).
Mon, Feb 10
Closing per the above patch (unless some issue remains).
I think having swift-repl manually set X-Timestamp is doable now. In that regard it would work much like rsync, which can preserve modification times. It would also behave better when the replication direction is switched. Right now, I assume the codfw files tend to have higher timestamps, so switching would cause pointless writes because the new source cluster would have higher-timestamped files than the new destination cluster. Since the timestamp is already stored anyway, this wouldn't add any metadata.
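A rough sketch of the per-object decision described above. `plan_copy` is a hypothetical helper, not part of swift-repl, and it assumes the destination cluster would accept a caller-supplied `X-Timestamp` header on write, which is what the comment proposes and is not verified here.

```python
def plan_copy(src_timestamp, dst_timestamp):
    """Decide whether one object needs copying, and with which headers.

    Timestamps are strings as stored by the cluster (e.g. "1581292800.12345");
    dst_timestamp is None when the object is missing at the destination.
    Returns (should_copy, headers).
    """
    if dst_timestamp is not None and float(dst_timestamp) >= float(src_timestamp):
        # Destination is already as new as the source: nothing to write.
        return False, {}
    # Carry the source timestamp over so both clusters end up with equal values
    # and a later direction switch does not trigger a rewrite.
    return True, {"X-Timestamp": src_timestamp}
```

With timestamps preserved this way, an object copied once compares as equal in both directions, so switching source and destination would not schedule it again.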
Sat, Feb 8
Thu, Feb 6
Wed, Feb 5
I think https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/563289/ is running into this.
Tue, Feb 4
Links to old (non-current) versions do not use the parser cache. This means that rendering will always require a full parse.
Mon, Feb 3
Wed, Jan 29
Tue, Jan 28
I compressed a sample log file from today to see what kind of compression ratios we could get:
Jan 23 2020
Not seeing this in the logs anymore.
Jan 21 2020
What user impact did it cause?
Jan 20 2020
As long as there are any health checks that hit MediaWiki in codfw that involve DB access (pretty much any normal/special page view), then LoadMonitor::getServerStates is reachable (in the course of picking a DB to connect to). That seems expected to me.
Jan 14 2020
Jan 13 2020
Jan 3 2020
These errors seem to be for Special:ConfirmEmail but the patch was for Special:ChangeEmail.
Dec 21 2019
Dec 5 2019
The RedisConnectionPool patch idea seems reasonable to me.
Dec 2 2019
Nov 27 2019
Nov 13 2019
Are there any cache busting user preferences at play here?
Nov 4 2019
Oct 30 2019
Aside from the things mentioned in the above patch, the overall code looks OK to me.
Oct 24 2019
I just want it on the work board (I had a meeting with Erik/Bill) for tracking object cache review and work (we have the goal of getting CPT more involved in maintenance rather than just myself and Timo).
Oct 23 2019
Oct 16 2019
Oct 11 2019
Oct 10 2019
Indeed the logging is based on the *whole* raw unfiltered position...I should add a logstash key for the filtered one too.
Oct 8 2019
@jcrespo @Marostegui What do you think of the idea of having another cluster of mysql servers set up just like the parser cache ones? That would be nice from an HA perspective and would avoid adding extra load to any existing DB cluster (e.g. the objectcache table of metawiki or extension1). Traffic would be modest given that it would start out being used for WikimediaEvents, LoginNotify, and perhaps AbuseFilter stats too (see https://docs.google.com/document/d/1tX8ekiYb3xYgpNJsmA1SiKqzkWc0F-_E4SGx6BI72vA/edit#heading=h.bdt9mhl3o7k5).
Oct 6 2019
Oct 2 2019
Sep 30 2019
Not seeing this in the logs anymore.
Sep 18 2019
Seems like some kind of merge conflict.
Sep 12 2019
Sep 11 2019
Sep 10 2019
Odd, the constant seems to be there.
Sep 9 2019
So, getting this test merged depends on redoing the wikibase schema hook application order for update.php. In CI, there seems to be a problem when it interacts with Flow hooks trying to make pages.
Sep 5 2019
Should be fixed now.
Aug 30 2019
It looks like WebStart.php sets ignore_user_abort() for POSTs and the major entry points have wfTransactionalTimeLimit() set for POSTs. In the case of module_deps updates for load.php, that's on GET.
Aug 29 2019
Client disconnects (HTTP 499) are interesting...before the ignore_user_abort() in doPostOutputShutdown(), I suppose it's possible to end up with stuff like this (and long has been). https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/519741/ would help this particular case by avoiding DB writes.
I wonder if some entry point lacks proper shutdown.
Aug 28 2019
What is the value of apc.enable_cli ? I don't seem to have that problem.
I do worry about the risk of data loss if swiftrepl is also deleting files based on container list differences.
Aug 26 2019
I'd love to have a simplified version of WebRequest as a service. One that would be useful for dealing with the issue that https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/532367/ is about. Optimization hacks like https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/526801/ could be avoided too. It could be injected with pathinfo/cookie settings, but would not deal with complex encoding stuff that uses $wgContLang and so on.
Aug 25 2019
Aug 23 2019
Still, a file was only uploaded, and no other operations done...I'm not sure why the DB would commit if the file store failed in one of the FileBackendMultiwrite backends and 'replication' is 'sync'...
Isn't there a swiftrepl background process to fix this?
Aug 22 2019
Note that CdnCacheUpdate queues a purge to happen X seconds later to help deal with lag (mediawiki-config has $wgCdnReboundPurgeDelay at 11). If lag gets near that amount, then $wgCdnMaxageLagged will kick in.
Aug 21 2019
Seems to be resolved, likely by vary-revision refactoring from T226785.