Moving the discussion here for the record:
Some time ago we noted that while we have some reasonable unit tests in some repos, and some reasonable browser tests in some repos, we lack any consistent tests at the API or services level.
Such tests would be useful not only as regression tests in the test environments, but also for monitoring the availability of key services in the production environments. UploadWizard (UW) on Commons is a case of particular note, especially its upload function: http://commons.wikimedia.org/w/api.php?action=help&modules=upload with the "stash" parameter.
Matt Flaschen is also doing things along those lines: https://gist.github.com/mattflaschen/7904894
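
To make the idea of an API-level availability check concrete, here is a rough sketch in Python. It is not taken from the gist above and is not an existing test. It assumes the Commons endpoint is reachable over https at the path quoted earlier, and that a paraminfo query returns its modules as a list of objects with "name" and "parameters" keys. The function names (build_paraminfo_url, find_problems, check_upload_api) and the User-Agent string are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Commons API endpoint, based on the URL quoted in this thread.
API_URL = "https://commons.wikimedia.org/w/api.php"


def build_paraminfo_url(api_url=API_URL, module="upload"):
    """Build a read-only paraminfo query describing one API module."""
    query = urllib.parse.urlencode(
        {"action": "paraminfo", "modules": module, "format": "json"}
    )
    return f"{api_url}?{query}"


def find_problems(payload, module="upload", required_param="stash"):
    """Return a list of readable problems; an empty list means the check passed."""
    if "error" in payload:
        code = payload["error"].get("code", "unknown")
        return [f"API returned an error: {code}"]
    # Assumed response shape: {"paraminfo": {"modules": [{"name": ..., "parameters": [...]}]}}
    modules = payload.get("paraminfo", {}).get("modules", [])
    match = next((m for m in modules if m.get("name") == module), None)
    if match is None:
        return [f"module '{module}' missing from paraminfo response"]
    names = {p.get("name") for p in match.get("parameters", [])}
    if required_param not in names:
        return [f"module '{module}' does not expose '{required_param}'"]
    return []


def check_upload_api(api_url=API_URL, timeout=10):
    """Hypothetical monitoring entry point: fetch paraminfo and report problems."""
    request = urllib.request.Request(
        build_paraminfo_url(api_url),
        headers={"User-Agent": "api-smoke-test-sketch/0.1"},  # placeholder value
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return find_problems(json.load(response))
```

Calling check_upload_api() from a monitoring job would return an empty list when the upload module is advertised with its "stash" parameter, and a list of messages otherwise. This only confirms that the API answers and still describes the module. A real regression test for the upload function would also need authentication, a token, and an actual stashed upload, none of which is attempted here.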