https://gerrit.wikimedia.org/r/#/c/372383/ demonstrates that a single change to Parsoid's output HTML can break dozens of tests in MCS. This should be avoidable.
An integration test is useful for catching changes like this, but only one test should have failed in this situation: a test that specifically checks the HTML output.
I suggest we simplify these tests, either by converting them to unit tests or by running them against smaller articles.
We should discard any tests that add no value beyond what the remaining tests already cover.
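
To illustrate the kind of unit test suggested above, here is a minimal sketch in Python. It is not MCS code: `extract_section_titles`, `HeadingCollector`, and the fixture markup are all hypothetical, and the fixture is hand-written rather than real Parsoid output. The point is only that a test fed a tiny fixture and asserting on one behaviour is unaffected by unrelated markup changes.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects the text of every <h2> element (hypothetical transform)."""

    def __init__(self):
        super().__init__()
        self._in_h2 = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False

    def handle_data(self, data):
        # Only text inside an <h2> contributes to a title.
        if self._in_h2:
            self.titles[-1] += data


def extract_section_titles(html):
    """Return the section titles found in an HTML fragment."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return [title.strip() for title in collector.titles]


# Small hand-written fixture (illustrative only, not real Parsoid output).
FIXTURE = """
<section><h2 id="History">History</h2><p>Text.</p></section>
<section><h2 id="Usage">Usage</h2><p>More text.</p></section>
"""

# The same content after an unrelated change to the paragraph markup.
FIXTURE_CHANGED = FIXTURE.replace("<p>", '<p class="new-class">')


def test_extract_section_titles():
    assert extract_section_titles(FIXTURE) == ["History", "Usage"]


def test_unrelated_markup_change_does_not_affect_titles():
    assert extract_section_titles(FIXTURE_CHANGED) == ["History", "Usage"]


if __name__ == "__main__":
    test_extract_section_titles()
    test_unrelated_markup_change_does_not_affect_titles()
```

With tests shaped like this, a change to the upstream HTML should only fail the tests whose asserted behaviour actually depends on the changed markup, plus the single integration test that compares against the full HTML output.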