Enable API integration tests in CI for MediaWiki core
Open, Medium | Public | 5 Estimated Story Points

Description

Requirement: run API integration tests on changes against mediawiki core (at least as gate jobs).

Status quo: We have API integration tests for core in the mediawiki/tools/api-testing repo. We also have a CI setup in that repo.

To do:

  • Package mediawiki/tools/api-testing
  • Update Quibble to search for api-testing npm package for core & extensions
  • Update documentation

Once all this is up and running, we can move (most) tests from api-testing/test to mediawiki/core/tests/mocha/.

Event Timeline

@zeljkofilipin suggested moving only the tests proper into the core repo and leaving the framework in a separate repo. We'd publish the framework as an npm package and add it as a dependency to core's package.json file. We'd also add a command for running these tests to core's package.json file, and trigger them from Quibble, just like we do for Selenium.

Wrinkles to consider:

  • we still need to generate the list of extensions to test. We can use GenerateMochaConfig, but we could also follow the approach we use for selenium - namely, listing all extensions that mention the relevant npm package in their own package.json.
  • what do we name the test directory? "mocha" seems like the obvious choice, but then, selenium tests are also based on mocha. "api" isn't a good fit either, since we also have api tests in phpunit (and qunit?).
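
The second approach in the first bullet (listing extensions that mention the relevant npm package) could be sketched roughly as follows. This is only an illustration, not Quibble's code: the function name `extensions_using_package`, the one-directory-per-extension layout, and the package name passed in are all assumptions made for the example.

```python
import json
from pathlib import Path


def extensions_using_package(extensions_dir, package_name):
    """Return the sorted names of extension directories whose package.json
    lists package_name under dependencies or devDependencies.

    Hypothetical helper: the layout <extensions_dir>/<Name>/package.json
    is an assumption of this sketch.
    """
    matches = []
    for manifest in Path(extensions_dir).glob("*/package.json"):
        try:
            data = json.loads(manifest.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            # Unreadable or malformed package.json: treat as not participating.
            continue
        if not isinstance(data, dict):
            continue
        declared = {}
        for key in ("dependencies", "devDependencies"):
            section = data.get(key)
            if isinstance(section, dict):
                declared.update(section)
        if package_name in declared:
            matches.append(manifest.parent.name)
    return sorted(matches)
```

Extensions without a package.json, or with one that does not mention the package, are simply left out of the list.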

Once api-testing is made an npm package, each extension interested in having such tests would add the module to its package.json. Then we can establish a convention that a specific npm script must be defined. Quibble can then process each extension and run the suite if that script is present.

That is what we have done for the Selenium suite. Each extension depends on the wdio-mediawiki module and defines a selenium-test script. Quibble then does something like:

for project in projects:
    ...
    if repo_has_npm_script(project_dir, 'selenium-test'):
        self.run_webdriver(project_dir)
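
For illustration, a check like `repo_has_npm_script` could look roughly like the sketch below. This is an assumption about how such a helper might be written, not a copy of Quibble's implementation; it only relies on the standard `scripts` section of package.json.

```python
import json
import os


def repo_has_npm_script(project_dir, script_name):
    """Return True if project_dir/package.json defines the given npm script.

    Illustrative sketch only; Quibble's real helper may differ.
    """
    package_json = os.path.join(project_dir, "package.json")
    try:
        with open(package_json, encoding="utf-8") as f:
            package = json.load(f)
    except (OSError, ValueError):
        # No package.json, or one that cannot be parsed: nothing to run.
        return False
    scripts = package.get("scripts") if isinstance(package, dict) else None
    return isinstance(scripts, dict) and script_name in scripts
```

The same check would work for an API test suite by passing whatever script name the convention settles on instead of 'selenium-test'.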

For extensions to test together, we have the wmf-quibble* jobs, which come with a somewhat arbitrary list of extensions. But we can surely craft a new job that tests a different set of extensions; repositories would then be promoted to participate in that shared integration testing job.


what do we name the test directory? "mocha" seems like the obvious choice, but then, selenium tests are also based on mocha. "api" isn't a good fit either, since we also have api tests in phpunit (and qunit?).

I can imagine people using an alternative to mocha, for example cucumber or jasmine, but I am nitpicking :] I guess tests/api-testing would be self-explanatory and would match the project name / purpose quite well :]

Change 553204 had a related patch set uploaded (by Clarakosi; owner: Clarakosi):
[mediawiki/tools/api-testing@master] Refactor api-testing in preparation for packaging

https://gerrit.wikimedia.org/r/553204

Change 553204 merged by jenkins-bot:
[mediawiki/tools/api-testing@master] Refactor api-testing in preparation for packaging

https://gerrit.wikimedia.org/r/553204

WDoranWMF triaged this task as Medium priority. Dec 3 2019, 4:50 PM
WDoranWMF set the point value for this task to 5.

Change 554571 had a related patch set uploaded (by Clarakosi; owner: Clarakosi):
[integration/quibble@master] Update Quibble to use api-testing npm package

https://gerrit.wikimedia.org/r/554571

Change 556405 had a related patch set uploaded (by Clarakosi; owner: Clarakosi):
[mediawiki/core@master] Add API-testing to core

https://gerrit.wikimedia.org/r/556405

Change 554571 merged by jenkins-bot:
[integration/quibble@master] Update Quibble to use api-testing npm package

https://gerrit.wikimedia.org/r/554571

I'll do a quibble release/etc. run.

Change 562918 had a related patch set uploaded (by Jforrester; owner: Jforrester):
[integration/quibble@master] Release Quibble 0.0.40

https://gerrit.wikimedia.org/r/562918

Change 562918 merged by jenkins-bot:
[integration/quibble@master] Release Quibble 0.0.40

https://gerrit.wikimedia.org/r/562918

Change 562959 had a related patch set uploaded (by Jforrester; owner: Jforrester):
[integration/config@master] dockerfiles: Create images for Quibble version 0.0.40

https://gerrit.wikimedia.org/r/562959

Change 562960 had a related patch set uploaded (by Jforrester; owner: Jforrester):
[integration/config@master] jjb: Switch over to images using Quibble version 0.0.40

https://gerrit.wikimedia.org/r/562960

Change 562959 merged by jenkins-bot:
[integration/config@master] dockerfiles: Create images for Quibble version 0.0.40

https://gerrit.wikimedia.org/r/562959

Mentioned in SAL (#wikimedia-releng) [2020-01-08T20:56:33Z] <James_F> Docker: Publishing Quibble 0.0.40 images on contint1001 T192167 T220586 T236222 T236680

Change 562960 merged by jenkins-bot:
[integration/config@master] jjb: Switch over to images using Quibble version 0.0.40

https://gerrit.wikimedia.org/r/562960

Change 556405 merged by jenkins-bot:
[mediawiki/core@master] Add API-testing to core

https://gerrit.wikimedia.org/r/556405

This is breaking CI for extensions that intentionally interfere with the login flow:

If these tests are just for core, can they just run against mediawiki/core patches and not extensions?

If these tests are just for core, can they just run against mediawiki/core patches and not extensions?

Yes, but then they won't detect when extensions break application logic or expectations of the API.
Also, some extensions may want to specify API integration tests of their own.

If these tests are just for core, can they just run against mediawiki/core patches and not extensions?

Yes, but then they won't detect when extensions break application logic or expectations of the API.

So far all the failures are assumptions about permissions or login, which are fully compliant with the API and applications should be coded around that. Can those tests be disabled/fixed?

Also, some extensions may want to specify API integration tests of their own.

That can be done without running the core tests against extensions...

So far all the failures are assumptions about permissions or login, which are fully compliant with the API and applications should be coded around that. Can those tests be disabled/fixed?

API tests can be disabled per repo, AFAIK.

I don't see a way to fix the tests in a way that would allow them to function without the standard login mechanism.

Also, some extensions may want to specify API integration tests of their own.

That can be done without running the core tests against extensions...

True. But I think right now, it's all or nothing.

Can the tests be run with a bot password?

Can the tests be run with a bot password?

I don't think so. The tests create their own user accounts, with various levels of access. They don't run against a single account.

Or do you mean using a bot password for the "root" user that is used to create/promote the other accounts? That might be possible...

The number of extensions that are being V-1'd by CI is increasing...

The number of extensions that are being V-1'd by CI is increasing...

It'll get lots worse when the main lot of the tests lands; right now, just a single set (out of ~50) has been merged.

We should probably disable the integration tests for extensions for now.

But we should also make sure to have a look at why they fail, and whether that can be fixed. Not running these tests on extension repos isn't great.

@hashar Is there a mechanism that would allow for disabling tests for extensions but not core as @daniel suggests?

@hashar Is there a mechanism that would allow for disabling tests for extensions but not core as @daniel suggests?

No. (And I would oppose attempts to do such a thing; the whole point of CI is to see what tests from which repos are broken by which other repos.)

@hashar Is there a mechanism that would allow for disabling tests for extensions but not core as @daniel suggests?

No. (And I would oppose attempts to do such a thing; the whole point of CI is to see what tests from which repos are broken by which other repos.)

Yea, but how do you imagine this would work e.g. for an extension that replaces the login mechanism, or that changes the content model of the main namespace? The tests are written to work with a vanilla setup. Extensions by definition create a non-vanilla setup. It's useful to check that extensions don't break any assumptions, sure, but some things just can't be tested without making assumptions that go beyond the minimal guarantees.

@hashar Is there a mechanism that would allow for disabling tests for extensions but not core as @daniel suggests?

No. (And I would oppose attempts to do such a thing; the whole point of CI is to see what tests from which repos are broken by which other repos.)

Yea, but how do you imagine this would work e.g. for an extension that replaces the login mechanism, or that changes the content model of the main namespace? The tests are written to work with a vanilla setup. Extensions by definition create a non-vanilla setup. It's useful to check that extensions don't break any assumptions, sure, but some things just can't be tested without making assumptions that go beyond the minimal guarantees.

The normal way that significant new tests for core are introduced is to add them incrementally, with a note to wikitech-l that as tests are added, extensions might become broken, and instructions on how to fix them / who to ask for help. This process is what we did for the introduction of the selenium tests, and for running tests in PHP73, and it's what I'll do for PHP74 quite soon (once I've fixed the main WMF repos). This implicitly means that new tests need a way to be extended, altered, and disabled by code that changes how MW operates in ways that defeat the tests' assumptions.

More fundamentally, what are the API tests meant to test, and for whom? If they only work in "vanilla" MediaWiki (which ships without a skin or an editor, amongst other things), whose development/maintenance are they serving? CI has grown out of the need to protect Wikimedia production, but the codebase there is ~200 extensions, most of which do exciting/worrying non-vanilla things (from replacing search to changing how user accounts work; from changing how user preferences work to altering the nature of protection levels; from providing new content types to replacing the DB back-end). Does this mean these tests will never actually report on Wikimedia breakages, because they'll all be disabled?

As a baseline, having extensions proactively declare which fundamental API tests they break at least gives us a way to tell exactly how "worrying" they are. Presumably the expectation would be that no extension would be enabled in production until/unless the set-of-tests-they-disable is zero.

It's not what I'd want in an ideal world, but the Parsoid "blacklist" mechanism (effectively a way to disable tests by stating which tests are "allowed" to fail) has been useful in an imperfect world both (a) as documentation of what doesn't work yet (sometimes this is Parsoid's fault, sometimes it's a bad test), and (b) for preventing regressions by implementing a ratchet: once a test is made to work it should continue to work. Getting 50% of core's tests running on your crazy extension and keeping that percentage from falling further is better than running 0% of core's API tests on your extension. IMO.

(One important principle that's worked for Parsoid is that "tests passing when the blacklist says they should fail" as well as "tests failing when the blacklist says they should pass" are treated as failures. That ensures the blacklist is kept up to date and the blacklist doesn't have any inertia towards growing larger. By forcing the blacklist to be precise it also ensures that any patches which change blacklist results (in either direction) have an update to the blacklist in the same patch, so it is easy for reviewers to see what's being fixed/broken in the patch.)
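
A minimal sketch of that two-way check, assuming test results are available as a mapping from test name to pass/fail and the list of tests allowed to fail is a plain set. The function name and data shapes are hypothetical and are not taken from Parsoid.

```python
def check_against_known_failures(results, known_failures):
    """Compare a test run with the list of tests that are allowed to fail.

    results: dict mapping test name -> True if the test passed.
    known_failures: set of test names expected to fail.
    Returns a list of (test name, problem) tuples; an empty list means
    the run matches the list exactly.
    """
    problems = []
    for name, passed in sorted(results.items()):
        expected_to_fail = name in known_failures
        if passed and expected_to_fail:
            # The ratchet: a fixed test must be removed from the list.
            problems.append((name, "unexpected pass"))
        elif not passed and not expected_to_fail:
            problems.append((name, "unexpected failure"))
    return problems


run = {"login": False, "search": True, "watchlist": True}
allowed_to_fail = {"login", "watchlist"}
print(check_against_known_failures(run, allowed_to_fail))
# → [('watchlist', 'unexpected pass')]
```

Because a pass on a listed test counts as a problem, a patch that fixes a test has to shrink the list in the same change, which is what keeps the list precise.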

Also, some extensions may want to specify API integration tests of their own.

That can be done without running the core tests against extensions...

True. But I think right now, it's all or nothing.

I don't understand what you mean - we can easily just run tests against MediaWiki core patches only, just like we do for PHPUnit/QUnit/etc.


I think it should be possible to write API tests that take into account arbitrary MediaWiki extensions - pywikibot's smoke tests that used to run against the beta cluster handled it fine.

In any case, I think we should have two sets of API tests:

  • "structure tests" that verify basic API functionality that must always work, e.g. action=help, action=paraminfo, action=query with no other parameters, etc. regardless of what extension is installed. This would be a rather small set of tests that run against both vanilla MediaWiki core and core plus arbitrary extensions.
  • "core tests" that do more extensive API testing, but run just against vanilla MediaWiki core instances.

Then there could be a third category of tests, in which extensions extend the core tests and which run against core plus arbitrary extensions.

This is the same setup that we use for PHPUnit tests, and I think it balances both sides rather well.
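
To make the proposed split concrete, here is a small sketch of how a runner might choose suites for a given wiki. The suite names and the function `suites_to_run` are hypothetical; nothing like this exists in the api-testing framework as far as this thread shows.

```python
from enum import Enum


class Suite(Enum):
    STRUCTURE = "structure"   # must pass on any wiki, with or without extensions
    CORE = "core"             # extensive tests, vanilla core only
    EXTENSION = "extension"   # tests supplied by extensions


def suites_to_run(installed_extensions):
    """Pick the suites for a wiki, given the names of its installed extensions."""
    if not installed_extensions:
        # Vanilla core: structure tests plus the extensive core tests.
        return [Suite.STRUCTURE, Suite.CORE]
    # Core plus extensions: structure tests plus extension-provided tests.
    return [Suite.STRUCTURE, Suite.EXTENSION]
```

Under this scheme the structure tests are the only ones that every configuration has to pass.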

So it seems we might need a little time to figure out the best strategy moving forward. Should we disable the one test in Core, for now, to not break the CI for others?

In any case, I think we should have two sets of API tests:

  • "structure tests" that verify basic API functionality that must always work, e.g. action=help, action=paraminfo, action=query with no other parameters, etc. regardless of what extension is installed. This would be a rather small set of tests that run against both vanilla MediaWiki core and core plus arbitrary extensions.
  • "core tests" that do more extensive API testing, but run just against vanilla MediaWiki core instances.

Then there could be a third category of tests, in which extensions extend the core tests and which run against core plus arbitrary extensions.

This approach makes sense to me. I only see one problem: the "structure tests" need a way to create user accounts and log in. But the login mechanism is not immune to extensions. We would need a "magical login for testing" mechanism that would give the tests a way to log in regardless of what login mechanism is configured.

Perhaps a combination of the bot password mechanism and knowledge of $wgSecretKey could provide this.

Change 564818 merged by jenkins-bot:
[mediawiki/core@master] Move Search and Watchlist tests from api-testing repo into Core

https://gerrit.wikimedia.org/r/564818

@Clarakosi and I have updated the API integration testing docs with a section on enabling tests in CI. Edits welcome!

thcipriani added a subscriber: thcipriani.

Looked at this task during our workboard triaging: can this task be closed?