
[Hypothesis] WE5.1.5
Closed, ResolvedPublic

Description

To improve DevEx in MW core through SonarCloud, we should experiment with repositories under similar conditions so that the results and learnings can inform how we proceed with MW core or global rules.

Parameters for Test Group Selection:

  • Repositories should have frequent activity, with at least a few commits per day, to facilitate faster iterations and improvements.
  • Teams maintaining these repositories should be accessible for feedback and suggestions.
  • The source code for test group projects should primarily consist of MediaWiki Core languages, namely PHP, JavaScript, TypeScript, and HTML.

Considering the above parameters, the repositories we selected as good test group candidates are listed below:

Project            Language     Active Rules   Disabled Rules
CheckUser          PHP          64             213
                   JavaScript   199            221
                   TypeScript   192            234
                   HTML         25             58
CampaignEvents     PHP          64             213
                   JavaScript   199            221
                   TypeScript   192            234
                   HTML         25             58
GrowthExperiments  PHP          64             213
                   JavaScript   199            221
                   TypeScript   192            234
                   HTML         25             58

To prevent unintended impacts on non-experiment repositories and ensure a clear separation between the test and control groups, we have developed an experimental SonarCloud bot. This bot provides enhanced analysis reports to developers on Gerrit and will be continually optimized based on developer feedback and suggestions. The bot is hosted on Toolforge, using the Toolforge Build Service.

Currently, approximately 64% of our total production repositories utilize SonarCloud for code health feedback. Details are available in T321837.

Event Timeline

pwangai changed the task status from Open to In Progress.Aug 22 2024, 12:21 PM
pwangai added a project: SonarQube Bot.

So far, all I ever see is the following:

image.png (279×549 px, 38 KB)

Maybe that just means that we write good code without bugs and code smells?
On the other hand, I think that particular change should record at least some code coverage on new code 🤔
But it is also still a work in progress, and one of the new tests for the new code is not passing yet; maybe that's why?

@Michael It's possible the code currently has no bugs or code smells, but as we enable more rules, you may start seeing alerts.
Just yesterday I enabled around 127 recommended PHP rules that were previously disabled, so if any issues arise,
they will be flagged. JavaScript rules haven’t been modified yet, but I'm prioritizing enabling all recommended rules,
which could generate additional feedback.

Regarding coverage, SonarCloud relies on the coverage data produced by our CI. If the CI generates a format that
SonarCloud can’t fully ingest, it affects the displayed coverage, which is why some projects show higher coverage than others.
We can work on improving this over time, but I’ll also check for immediate improvements. A good example is
mediawiki/extensions/WikiLambda, which has strong coverage numbers—I’ll compare their setup to see if there is something
different they have done that we can adopt.
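For reference, SonarCloud does not measure coverage itself; it ingests report files that CI produces, configured via scanner properties. A minimal sketch (the property names come from SonarCloud's PHP and JavaScript analyzers; the project key and paths here are illustrative, not the actual CI configuration):

```properties
# sonar-project.properties (illustrative sketch, not the real config)
sonar.projectKey=mediawiki-extensions-CheckUser
sonar.sources=src
sonar.tests=tests

# SonarCloud reads coverage from report files that CI has already generated.
# If these paths are missing or the format is not fully supported,
# the displayed coverage drops even though tests ran fine.
sonar.php.coverage.reportPaths=coverage/clover.xml
sonar.javascript.lcov.reportPaths=coverage/lcov.info
```

If the Clover/LCOV files CI emits don't match what these properties point at, SonarCloud silently shows lower coverage, which would explain the per-project differences mentioned above.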

[...]
Regarding coverage, SonarCloud relies on the coverage data produced by our CI. If the CI generates a format that
SonarCloud can’t fully ingest, it affects the displayed coverage, which is why some projects show higher coverage than others.
We can work on improving this over time, but I’ll also check for immediate improvements. A good example is
mediawiki/extensions/WikiLambda, which has strong coverage numbers—I’ll compare their setup to see if there is something
different they have done that we can adopt.

Thank you for looking into that! I noticed that SpecialManageMentorsTest.php did not seem to create code coverage in sonar (it works in the mwext-phpunit-coverage-patch CI job). I wondered if this is maybe caused by its old way of annotating the coverage, but tests: simplify coverage annotation and improve types did not seem to improve the data in Sonar.

When glancing at the tests that do generate code coverage in Sonar, I notice that they are all unit-tests, that is, extending MediaWikiUnitTestCase and being located in the unit/ directory. Maybe that could be a promising line of investigation?
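For context, the unit/ vs. integration split described above is typically expressed as separate PHPUnit test suites, so CI can run only the fast suite when generating the Sonar coverage report. A rough sketch (suite names and directories are illustrative, not MediaWiki's actual configuration):

```xml
<!-- phpunit.xml (illustrative sketch) -->
<phpunit>
    <testsuites>
        <!-- Fast tests extending MediaWikiUnitTestCase, under tests/phpunit/unit/;
             only these feed the coverage report sent to SonarCloud -->
        <testsuite name="unit">
            <directory>tests/phpunit/unit</directory>
        </testsuite>
        <!-- Slower, DB-backed integration tests; currently excluded from
             the SonarCloud coverage run -->
        <testsuite name="integration">
            <directory>tests/phpunit/integration</directory>
        </testsuite>
    </testsuites>
</phpunit>
```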

AFAIK this is because sonar cloud does not run integration tests when generating code coverage.

This was the case maybe a year ago when I tried to work this out for CheckUser. I think I was told at the time that this was because the integration tests were too slow.

AFAIK this is because sonar cloud does not run integration tests when generating code coverage.

This was the case maybe a year ago when I tried to work this out for CheckUser. I think I was told at the time that this was because the integration tests were too slow.

Good point! Looking at https://integration.wikimedia.org/ci/job/mwext-codehealth-patch/130349/console that seems to still be the case, at least for GrowthExperiments.

AFAIK this is because sonar cloud does not run integration tests when generating code coverage.

This was the case maybe a year ago when I tried to work this out for CheckUser. I think I was told at the time that this was because the integration tests were too slow.

Yes, this was by design when first setting this up. The two reasons were 1) speed, and 2) a sentiment that unit tests were a more accurate measure of coverage, because with integration tests it is easier to invoke unrelated code. I don't think anyone had a very strong feeling either way, so if someone wanted to propose widening this to include integration tests, it's probably possible. OTOH, the very quick feedback from unit test coverage is nice to have.

I don't think anyone had a very strong feeling either way, so if someone wanted to make a proposal to widen this to include integration tests, it's probably possible.

I would like this. It was confusing for me to see 20% coverage through https://sonarcloud.io/project/activity?category=QUALITY_GATE&graph=coverage&id=mediawiki-extensions-CheckUser but then see 85% at https://doc.wikimedia.org/cover-extensions/CheckUser/.

While I get the argument about unit tests likely being more accurate, it is difficult to test some areas of the code with unit tests because they specifically interact with the database.

I don't think anyone had a very strong feeling either way, so if someone wanted to make a proposal to widen this to include integration tests, it's probably possible.

I would like this. It was confusing for me to see 20% coverage through https://sonarcloud.io/project/activity?category=QUALITY_GATE&graph=coverage&id=mediawiki-extensions-CheckUser but then see 85% at https://doc.wikimedia.org/cover-extensions/CheckUser/.

While I get the argument about unit tests likely being more accurate, it is difficult to test some areas of the code with unit tests because they specifically interact with the database.

One argument in favor of keeping the coverage report to unit tests is that integration tests can mask less-than-ideal code: for example, you can use global variables and functions and still get a good integration test coverage score, whereas this would not be possible with unit test coverage alone. IMO, as long as we have doc.wikimedia.org for publishing the integration test coverage, it's fine for the SonarCloud report to contain just unit test coverage, so we can distinguish between the two.

Alternatively, we could rework the doc.wikimedia.org publishing process to contain two coverage reports, one for unit and one for unit + integration, and switch the SonarCloud publishing to include unit + integration.
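That alternative could be sketched as two CI coverage runs, assuming the repository defines separate unit and integration PHPUnit test suites (the flags are standard PHPUnit options; the suite names and output paths are hypothetical):

```shell
# Illustrative CI steps, not the actual job configuration.

# Unit-only coverage, to be published to doc.wikimedia.org alongside
# the combined report so the two numbers stay distinguishable:
php vendor/bin/phpunit --testsuite unit --coverage-clover coverage/unit-clover.xml

# Unit + integration coverage, to be ingested by SonarCloud:
php vendor/bin/phpunit --testsuite unit,integration --coverage-clover coverage/full-clover.xml
```

This would keep the SonarCloud number comparable to the doc.wikimedia.org number while preserving the unit-only view.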