= Summary
Currently the JavaScript unit tests for all extension and skin repos are run together on every single commit that enters the integration pipeline. In effect we use QUnit for integration testing, even though Selenium (despite having its own problems) is meant for integration testing and has already been adopted in many repos. To make QUnit usable for integration testing, we load a MediaWiki instance and launch a web browser that runs all the unit tests in Special:JavaScriptTest/qunit.
The proposal here is to stop this practice and move towards a setup where JavaScript unit tests must be written in such a way that they:
1) run in isolation, without side effects from unit tests in other extensions.
2) can be run quickly from the command line, without any setup steps (MediaWiki, LocalSettings edits).
Given discussion on this ticket, we can retain the Special:JavaScriptTest/qunit/plain mode for true integration tests (which will hopefully run a much smaller set of tests).
== Problem statement
I argue that our current usage of QUnit is problematic for several reasons.
1) Most of the tests run in QUnit are not integration tests, and running them all together is a nuisance to most developers. Currently a badly written unit test that touches global variables can break another, unrelated extension. This happens frequently and in certain circumstances can grind all JavaScript development to a halt for a day. E.g. T214804 (but there are countless more examples), #jenkins-failure (T218172).
2) Running QUnit tests in this way is not catching real integration errors. Most integration errors are uncovered these days using Selenium.
3) Having to launch a browser and a MediaWiki instance makes these jobs slow, particularly when run locally on an install that needs multiple extensions to function. In MobileFrontend, headless QUnit tests run in 10s, compared to 2 minutes on Jenkins (when the tests for all extensions are also run).
4) The way we run QUnit is incompatible with most Node.js tooling. It is very difficult to add code coverage for JS when we run tests in this way.
5) The current infrastructure limits what we can test. To test a function it has to be made publicly accessible, which usually means exposing it on the `mw` global object for no other reason and adds noise there.
== Implementation proposal
To do this we would follow the path taken in MobileFrontend and Popups. MobileFrontend currently runs its unit tests in two modes: a local `npm run test:unit` and the current Special:JavaScriptTest/qunit/plain method.
[] All core code would be rewritten to use `module.exports` and `require` via ResourceLoader packageFiles. I have a proof of concept describing a backwards compatible migration. See: https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/487168/
[] The majority of tests in MediaWiki extensions/core would gradually be rewritten using a new Node.js library, mw-node-qunit, run via `npm test`. The library would be provided by the Reading Web team and would supply the minimal MediaWiki JS environment needed for testing. It would force tests to be written in a sane way by requiring them to stub calls to `mw.msg` and `mw.config`, among other things that can affect global state and have side effects.
[] Code coverage reports for core JavaScript would be published as a result of this project.
[] The current test runner would be kept for running true integration tests in QUnit.
[] If needed, support would be added for running tests inside Special:JavaScriptTest/qunit locally for debugging purposes.
= Q and A / talking points
**Would Special:JavaScriptTest/qunit disappear altogether?**
Maybe. There might be value in using QUnit for a small set of unit tests. I would argue that Selenium, despite its problems, would be a better approach for ensuring compatibility between different extensions.
**Wouldn't we lose protection against browser specific bugs?**
I am not aware of any situation where we have caught a regression by running our unit tests in different browsers, e.g. Firefox/Chrome/Internet Explorer. While it is a nice idea, I don't see any evidence that we would catch any issues this way. The benefit seems hypothetical rather than practical. The majority of issues in the mobile website are caught via Selenium (I can collect data if we really need this), and Selenium seems a much better way to catch these problems. Unit tests tend to prevent errors from entering the codebase and to stop known errors from reappearing. For example, unit tests did not help us prevent a recent regression on mobile (T215536), while Selenium did.
[1] If necessary, we can retain the use of QUnit for integration testing, but only for a small and limited set of tests.