We seem to be constantly fighting a battle against flaky browser tests.
We go through a couple of prolonged cycles in this area, which just make the situation worse all around:
**Assuming they are random, and rechecking**
- My CI run fails for a seemingly flaky reason
- I type `recheck`, re-running ALL jobs (not only the failed one), probably expending 30-60 minutes' worth of execution time on CI nodes
- The `recheck` turns the build green, so the issue never gets flagged, investigated, or fixed
**Failing to report issues & save needed details**
- Someone reports a seemingly flaky browser test in CI at a high level, linking to the logs of the CI runs
- The team that needs to look at the failure may not do so for a week or two, as the ticket goes through their process
- By the time the team looks at the ticket, the links to the CI builds are dead and the investigation is hard / not worth the effort at that point
- Wait for the process to repeat
I propose that we experiment with looking at seemingly flaky / failing browser tests centrally, in the Jenkins logs.
I would hope that:
- We can catch "trending" issues before people would normally report them
- We can look at issues with multiple runs and logs being automatically provided to us (instead of waiting for people to report more failures in Phabricator)
- We diagnose and fix the issues faster
- Everyone is happier, and we reduce the painful loops mentioned above.
**Collection of data**
```
ssh contint2001.wikimedia.org
# Concatenate the logs of all selenium jobs (skipping apache logs), pull out
# the failure lines (marked with ✖), and count duplicates, most frequent first
ls /srv/jenkins/builds/*-selenium-*/*/log | grep -v apache | xargs cat | grep ✖ | sort | uniq -c | sort -nr
```
And then, in an editor, replace `\s+(\d+).*(✖.*)\n` with `$1, $2\n` to turn the counted lines into `count, failure` rows.
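That manual find/replace step could likely be folded into the pipeline itself; a sketch with GNU `sed` (the exact `uniq -c` spacing is an assumption on my part):

```shell
# Convert each "   <count> ✖ <test name>" line from `uniq -c` into a
# "<count>, ✖ <test name>" row, so the output can be pasted as CSV
ls /srv/jenkins/builds/*-selenium-*/*/log | grep -v apache \
  | xargs cat | grep ✖ | sort | uniq -c | sort -nr \
  | sed -E 's/^ *([0-9]+) +(✖.*)$/\1, \2/'
```

This keeps the whole collection step as a single command, at the cost of depending on GNU sed's `-E` flag being available on the host.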