Suggested to me by @Bawolff, though I've expanded the scope.
We should have a script/bot that regularly tries to exploit potential security issues (e.g. anything related to media files, shelling out, etc.), and probably past ones too, for regression testing. I'd imagine this script lives somewhere in toolforge/wmcloud, runs on a schedule, and fires off emails if a check fails. Ideally it would not actually edit the wiki, and would instead use preview or other methods to avoid making public write actions.
E.g. for T257062 we could try to action=parse a malicious <score> block and see if it errors out. I'm sure there are a bunch of other things we could find by going through past security issues or by looking at places where we've added preventative hardening measures.
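A rough sketch of what one such check could look like, assuming Python and only the standard library. It sends wikitext to `action=parse`, which renders without saving anything, and then decides whether the payload was rejected. Everything specific here is made up for illustration: the target wiki URL, the User-Agent string, the function names, and above all the idea of detecting success via a marker string in the rendered HTML. The real payload for any given task, and what "errors out" looks like for it, would need to be filled in per check.

```python
import json
import urllib.parse
import urllib.request

# Assumption: which wiki to probe is undecided; this URL is only a placeholder.
API_URL = "https://test.wikipedia.org/w/api.php"


def build_parse_request(wikitext):
    """Build POST parameters for a read-only action=parse call (no edit is made)."""
    return {
        "action": "parse",
        "format": "json",
        "formatversion": "2",  # with version 2, parse.text is a plain string
        "contentmodel": "wikitext",
        "prop": "text",
        "text": wikitext,
    }


def probe_rejected(response, marker):
    """Return True if the payload looks blocked.

    Hypothetical criterion: the payload is written so that a successful
    exploit would make `marker` show up in the rendered HTML. An API-level
    error also counts as "blocked".
    """
    if "error" in response:
        return True
    html = response.get("parse", {}).get("text", "")
    return marker not in html


def run_probe(wikitext, marker, api_url=API_URL):
    """POST the payload to the API and classify the result."""
    data = urllib.parse.urlencode(build_parse_request(wikitext)).encode()
    request = urllib.request.Request(
        api_url,
        data=data,
        # Hypothetical User-Agent; a real bot should identify its operator.
        headers={"User-Agent": "security-regression-probe/0.1 (hypothetical)"},
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return probe_rejected(json.load(resp), marker)


# Example call (placeholder payload, not the actual T257062 exploit):
#   ok = run_probe("<score>...</score>", marker="PROBE-MARKER")
#   if not ok: send the alert email
```

Keeping `probe_rejected` separate from the network call means the pass/fail logic for each check can be unit tested offline, and the scheduled job only has to loop over (payload, marker) pairs and email on any `False`.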