Thu, Dec 5
The RedisConnectionPool patch idea seems reasonable to me.
Mon, Dec 2
Wed, Nov 27
Wed, Nov 13
Are there any cache busting user preferences at play here?
Nov 4 2019
Oct 30 2019
Aside from the things mentioned in the above patch, the overall code looks OK to me.
Oct 24 2019
I just want it on the work board (I had a meeting with Erik/Bill) for tracking object cache review and work (we have the goal of getting CPT more involved in maintenance rather than just myself and Timo).
Oct 23 2019
Oct 16 2019
Oct 11 2019
Oct 10 2019
Indeed the logging is based on the *whole* raw unfiltered position...I should add a logstash key for the filtered one too.
Oct 8 2019
@jcrespo @Marostegui What do you think of the idea of having another cluster of mysql servers set up just like the parser cache ones? That would be nice from an HA perspective and would avoid adding extra load to any existing DB cluster (e.g. the objectcache table of metawiki or extension1). Traffic would be modest given that it would start out being used for WikimediaEvents, LoginNotify, and perhaps AbuseFilter stats too (see https://docs.google.com/document/d/1tX8ekiYb3xYgpNJsmA1SiKqzkWc0F-_E4SGx6BI72vA/edit#heading=h.bdt9mhl3o7k5).
Oct 6 2019
Oct 2 2019
Sep 30 2019
Not seeing this in the logs anymore.
Sep 18 2019
Seems like some kind of merge conflict.
Sep 12 2019
Sep 11 2019
Sep 10 2019
Odd, the constant seems to be there.
Sep 9 2019
So, getting this test merged depends on redoing the wikibase schema hook application order for update.php. In CI, there seems to be a problem when it interacts with Flow hooks trying to make pages.
Sep 5 2019
Should be fixed now.
Aug 30 2019
It looks like WebStart.php sets ignore_user_abort() for POSTs and the major entry points have wfTransactionalTimeLimit() set for POSTs. In the case of module_deps updates for load.php, that's on GET.
Aug 29 2019
Client disconnects (HTTP 499) are interesting...before the ignore_user_abort() in doPostOutputShutdown(), I suppose it's possible to end up with stuff like this (and long has been). https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/519741/ would help this particular case by avoiding DB writes.
I wonder if some entry point lacks proper shutdown.
Aug 28 2019
What is the value of apc.enable_cli? I don't seem to have that problem.
I do worry about the risk of data loss if swiftrepl is also deleting files based on container list differences.
Aug 26 2019
I'd love to have a simplified version of WebRequest as a service. One that would be useful for dealing with the issue that https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/532367/ is about. Optimization hacks like https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/526801/ could be avoided too. It could be injected with pathinfo/cookie settings, but would not deal with complex encoding stuff that uses $wgContLang and so on.
Aug 25 2019
Aug 23 2019
Still, a file was only uploaded, and no other operations done...I'm not sure why the DB would commit if the file store failed in one of the FileBackendMultiwrite backends and 'replication' is 'sync'...
Isn't there a swiftrepl background process to fix this?
Aug 22 2019
Note that CdnCacheUpdate queues a purge to happen X seconds later to help deal with lag (mediawiki-config has $wgCdnReboundPurgeDelay at 11). If lag gets near that amount, then $wgCdnMaxageLagged will kick in.
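A rough sketch of that timing logic, written in Python rather than MediaWiki's PHP. Only the 11-second rebound delay comes from the comment above; the function name, the `near_fraction` threshold, and the max-age values used in the example are hypothetical placeholders, not the production settings.

```python
REBOUND_PURGE_DELAY = 11  # seconds; the $wgCdnReboundPurgeDelay value quoted above


def choose_cdn_maxage(replica_lag, normal_maxage, lagged_maxage, near_fraction=0.5):
    """Pick a CDN max-age (seconds) given the current replica lag (seconds).

    Assumption: "near" the rebound delay means
    replica_lag >= near_fraction * REBOUND_PURGE_DELAY.
    The real threshold is not stated in this thread.
    """
    if replica_lag >= near_fraction * REBOUND_PURGE_DELAY:
        # The delayed second purge may fire before replicas catch up, so cap
        # how long a possibly stale response is allowed to stay cached.
        return min(normal_maxage, lagged_maxage)
    return normal_maxage
```

For example, with a hypothetical normal max-age of 3600 and a lagged max-age of 60, `choose_cdn_maxage(10, 3600, 60)` falls back to the shorter value, while `choose_cdn_maxage(0, 3600, 60)` keeps the normal one.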
Aug 21 2019
Seems to be resolved, likely by vary-revision refactoring from T226785.
Aug 20 2019
Aug 19 2019
Aug 17 2019
I don't think so.
Aug 15 2019
Does this still occur?
Aug 12 2019
Per my comment above, this is the expected behavior.
It's an optional table, not installed by update.php.
Aug 9 2019
They were obsoleted by flaggedrevs_statistics.
Aug 8 2019
The remaining vary-revision instances are basic self-transclusions (https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/526157/ should handle those).
Aug 5 2019
Aug 2 2019
Aug 1 2019
Jul 31 2019
Is https://phabricator.wikimedia.org/T212881#5195101 the error that still happens or is it the read-only one too?
Jobs are fine...though this case is complicated since people want their "latest views" to be immediately reflected...so it would have to do something like WatchedItemStore.
How much of this is distinct from T205936?
Jul 27 2019
Jul 25 2019
Jul 23 2019
I wonder if this is fixed in https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/519565/
The logs for doSelectDomain() look quiet for the last 7 days.
959daa2ca44c039e72c8a9a5199d4c74dd05caba added the << $status->value = [ 'warnings' => $upload->checkWarnings() ]; >> line. It seems like checkWarnings() has all kinds of File objects inside of it potentially. Some callback could easily slip in given that.
Jul 22 2019
ObjectCache always mentioned getMainStashInstance() as "Ephemeral global storage". It was just supposed to *try harder* to be persistent than memcached (rdb snapshots, expectation that stuff can *probably* still be there a week later or so). The existence of redis evictions and consistent re-hashing on host failure making data disappear or go stale was well known at the time it was picked as the original "stash".
Jul 19 2019
JobQueueException should be thrown from push(), with nothing catching it other than MWExceptionHandler or site-specific callers. Things like RenameUser *depend* on knowing whether something enqueued or not in order to function correctly. Typically, push() should be used pre-send, before preOutputCommit, so everything would just roll back anyway. Jobs pushed after that are enqueued during DeferrableUpdates (directly or indirectly via lazyPush()); in that case, DeferredUpdates should (already) catch any exceptions (not just job queue ones) and roll back on an update-by-update basis. The exceptions are logged in the DeferredUpdates channel (previously the Exception channel).
Also, the timeout exceptions themselves were from redis, not LBFactory. The latter seemed to just have errors related to the improper shutdown.
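The per-update handling described for DeferredUpdates in the push() comment can be sketched as follows. This is an illustrative Python model, not MediaWiki code: `JobQueueError`, `run_deferred_updates`, and the `rollback` callback are hypothetical names standing in for JobQueueException, the DeferredUpdates runner, and the transaction rollback.

```python
import logging

# Channel name mirrors the "DeferredUpdates" log channel mentioned above.
logger = logging.getLogger("DeferredUpdates")


class JobQueueError(Exception):
    """Hypothetical stand-in for JobQueueException."""


def run_deferred_updates(updates, rollback):
    """Run each update in order; a failure affects only that update.

    `updates` is a list of zero-argument callables and `rollback` is a
    zero-argument callable that undoes the failed update's pending writes.
    Returns the list of exceptions that were caught.
    """
    failures = []
    for update in updates:
        try:
            update()
        except Exception as exc:  # any exception, not just job queue ones
            logger.error("Deferred update failed: %s", exc)
            rollback()
            failures.append(exc)
    return failures
```

A pre-send push() would sit outside a runner like this, so its exception propagates to the caller and the whole request rolls back, which is what callers such as RenameUser rely on.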
Jul 18 2019
Dropping the field doesn't make sense, but dropping the whole table does. We do not use that class in production (and it is optional within MW core).