Audit all existing code to ensure that any extension currently or previously adding blobs to ExternalStore has been registering a reference in the text table (and fix up if wrong)
Mentioned In
- T183419: Determine how to update old compressed ExternalStore entries for T181555
- T106386: Compress data at external storage
Mentioned Here
- T106386: Compress data at external storage
- T107610: Setup separate logical External Store for Flow in production
- rEABF42bd0d84f424: AbuseFilter: Change format of database logging/ performance
- T34478: AbuseFilter not setting utf-8 flag
AIUI: the immediate space problem would first be solved by buying new hardware & moving all data to larger disks, right? Flow is not blocking that.
After that, the plan would be to recompress all existing ExternalStore entries by running trackBlobs.php and recompressTracked.php. Flow is not blocking that.
After that, we should be able to decommission the old (uncompressed) clusters, as all data will have been recompressed and moved over. This is blocked by Flow: since Flow doesn't store references in the text table, its entries would not be recompressed and moved to the new cluster, and would be lost once we get rid of the old clusters.
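To make the failure mode concrete, here is a minimal sketch of the kind of audit this task asks for: list the blobs that exist in an ExternalStore cluster but are not referenced from the text table. It assumes references look like `DB://<cluster>/<id>` in `old_text` on rows whose `old_flags` include `external`; the cluster name `cluster24`, the sample rows, and the helper names are hypothetical, and a real audit would read from the databases rather than from in-memory lists.

```python
import re

# Assumed reference shape: DB://<cluster>/<blob id>, optionally /<item hash>.
EXTERNAL_URL = re.compile(r"^DB://(?P<cluster>[^/]+)/(?P<blob_id>\d+)(?:/.*)?$")


def referenced_blobs(text_rows):
    """Collect (cluster, blob_id) pairs referenced from text-table rows.

    Each row is an (old_text, old_flags) pair. Only rows whose
    comma-separated flags include 'external' are treated as references.
    """
    refs = set()
    for old_text, old_flags in text_rows:
        if "external" not in old_flags.split(","):
            continue
        match = EXTERNAL_URL.match(old_text)
        if match:
            refs.add((match.group("cluster"), int(match.group("blob_id"))))
    return refs


def untracked_blobs(stored_blobs, text_rows):
    """Return stored blobs that no text-table row points at, sorted."""
    return sorted(set(stored_blobs) - referenced_blobs(text_rows))


# Hypothetical sample data: blob 103 stands in for an extension-owned
# entry that was written without a text-table reference.
text_rows = [
    ("DB://cluster24/101", "utf-8,gzip,external"),
    ("DB://cluster24/102", "utf-8,external"),
    ("inline revision text", "utf-8"),
]
stored_blobs = [("cluster24", 101), ("cluster24", 102), ("cluster24", 103)]

orphans = untracked_blobs(stored_blobs, text_rows)
print(orphans)
```

Anything this reports is an entry that a recompression pass driven by text-table references would not see, which is the situation described for Flow above.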
I suggest to first move Flow's ExternalStore entries away from the shared ExternalStore clusters and into its own ExternalStore DB. The script to do that is mostly done already.
Are there any good reasons not to set up a new ExternalStore cluster specific to Flow data (and possibly others), where trackBlobs.php and recompressTracked.php don't run, and to move the existing Flow ExternalStore entries there?
Too many tickets to keep track of :-).
For everybody: please note that even if new hardware is a blocker for the actual migration, things should be prepared by the time it arrives (I understand this is not an easy topic, though).
This is assigned to DBAs-operations. DBAs are not going to audit any kind of MediaWiki code == declined. Jforester, feel free to reopen or create a new ticket, but assigned to the right team. Not doing this could break all MediaWiki content, though. CC MediaWiki-Platform-Team and @brion, as this is probably related to the revision table reworking (it just makes no sense to keep it open as is).
Not true. It wasn't assigned to anyone. This task was created as a split out of your task, T106386: Compress data at external storage, at your instigation a couple of years ago. Did you wish to decline that task instead?
"DBAs are not going to audit any kind of MediaWiki code"
Has the need gone away?
"Has the need gone away?"
No, this is very much needed, but the way I use Phabricator for DBA tickets is: if they are assigned to us and I cannot do anything about them, I decline them. Anyone else can reopen and reassign them or start working on them. Otherwise it would give the reporter the wrong expectation that the task is on our backlog. There is one exception: if the task is assigned to some other project, I move it to "blocked external". Not declining it means it will be lost on my backlog! :-)
I am cool with other people using Phabricator differently, in which case, just delete the DBA and #operation tags and put it back to being untriaged.
Despite my particular usage of Phabricator, there should probably be a way to distinguish "hey, this is interesting to you and you should be aware of it, but I or someone else will do it" from "hey, can you do this if/when you have the time?", answered with "yep, I cannot say when I will be doing this, but I will own this" :-D I use a separate column for that, but it may not be clear in all cases, especially given the "support" vs. "development" ways of doing things.
Generally we put them in a different column, yes – "Watching" or "External" or whatever. But it can be confusing for anyone not in the team, as it's not obvious whether the tag on the task means "Foo are going to do this, you can ignore it" or "Foo are anxious that you get this done right now!" or anything in between. :-)