Proposal for refactoring of backport dependency handling
Open, Needs Triage, Public

Assigned To

None

Authored By

	• jnuche
	Apr 19 2024, 3:26 PM

Description

Background

The goal of this task is to act as a central place to track the fixes and changes we want to make to how the backport code manages change dependencies. An important aspect will be to add documentation to the codebase that captures the design and terminology employed. The hope is that this documentation will serve as both a reference and a starting point for any planned changes in the future.

My intention here is to start a conversation with @dancy and @jeena to refine all the items mentioned in the task and agree on a way forward. New subtasks can then be created to tackle the different parts.

Terminology

Operator: User running scap backport from the command line
Change: A Gerrit change
Root change: A change directly targeted for backport by an operator, i.e. passed as an argument to scap backport from the command line
Backport repositories: The set of repositories for which it is possible to backport changes. This includes all the MediaWiki code repos (core, extensions and skins) and the MW configuration repo. In other words, the repos checked out in the deployment server under /srv/mediawiki-staging and their respective submodules.
Production branch: The master branch of the MW configuration repo or any MW version in the wikiversions.json file in the deployment server. Restricted to backport repos
Production change: A change that targets a production branch
Deployable branch: A branch currently checked out in the deployment server. Restricted to backport repos. Production branches are a subset
Deployable change: A change that targets a deployable branch. Production changes are a subset
Dependency: A change targeted directly or transitively by another change's Depends-On clause
Sibling dependencies: The set of changes that share the Change-Id specified by a Depends-On clause
Relevant dependency: A dependency that has an effect on whether a root change can be backported or not
Dependency trail: The graph/tree of all relevant dependencies of a root change

Discrete issues

Allowed root changes

Only deployable changes can be passed as root changes, that is, changes from a backport repository that target a branch currently checked out on the deployment server. In any other case scap should bail out.
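
A minimal sketch of this check, for illustration only. The `Change` record, `is_deployable` and `validate_root_changes` are hypothetical names rather than scap's actual code, and the set of backport repositories and the checked-out branches per repository are assumed to be supplied by the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical stand-in for a Gerrit change; not scap's actual model."""
    number: int
    project: str
    branch: str


def is_deployable(change, backport_repos, checked_out_branches):
    """Return True if the change comes from a backport repo and targets a
    branch currently checked out for that repo on the deployment server."""
    if change.project not in backport_repos:
        return False
    return change.branch in checked_out_branches.get(change.project, set())


def validate_root_changes(roots, backport_repos, checked_out_branches):
    """Raise ValueError for the first root change that is not deployable."""
    for change in roots:
        if not is_deployable(change, backport_repos, checked_out_branches):
            raise ValueError(
                f"Change {change.number} ({change.project}@{change.branch}) "
                "is not deployable and cannot be passed as a root change"
            )
```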

Non-backport repositories

Changes from non-backport repos cannot be backported. When:

Passed as a root change: scap should fail and abort the operation
Present as a dependency: scap should emit a confirmation warning prompt with a clear explanation of the risk involved and that scap won't handle the dependency
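
The two cases above could be sketched as a small decision function. This is only an illustration: `Action` and `action_for` are hypothetical names, and the set of backport repositories is assumed to be known to the caller.

```python
from enum import Enum


class Action(Enum):
    PROCEED = "proceed"   # change belongs to a backport repo
    ABORT = "abort"       # fail and abort the operation
    CONFIRM = "confirm"   # warn the operator and ask for confirmation


def action_for(project, is_root, backport_repos):
    """Decide how to treat a change given its repository and its role."""
    if project in backport_repos:
        return Action.PROCEED
    # Non-backport repo: fatal for a root change, a confirmation
    # warning prompt when it only shows up as a dependency.
    return Action.ABORT if is_root else Action.CONFIRM
```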

Relevant dependencies

NOTE: See T365146 for a refined version of the criteria below

~~Given a root change or relevant dependency D targeting branch br, one of its dependencies K as specified by Depends-On is also relevant in the following cases.~~

# D and K belong to a MW code repo --> K will be relevant if it targets br, or if it targets master but there is no sibling dependency targeting br. This is important to prevent T345304
~~# K belongs to the MW configuration repo --> K will be relevant if it targets branch master, irrespective of what repo D belongs to~~
# D belongs to the MW configuration repo and K belongs to a MW code repo --> K will be relevant if it's a deployable change, or if it targets master but there are no siblings targeting deployable branches. Again, important to prevent T345304

Recursively applying the criteria ~~above~~ in T365146 produces the dependency trail for a root change. Note that a transition is created when a relevant dependency hits the MW configuration repository. At that point we lose the information about what deployable branch we are tracking and the criterion for what constitutes a relevant dependency becomes less strict.

If a change in the dependency trail specifies a Depends-On clause but no relevant dependencies can be determined for it, scap should prompt with a confirmation warning to let the operator know about the anomalous situation.
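
As an illustration only, the two criteria in this section that are not struck out could be sketched as follows. The `Change` record, the `in_config_repo` flag and `is_relevant` are hypothetical names, not scap's actual code, and K is assumed to belong to a MW code repo. T365146 holds the refined criteria and takes precedence over this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical stand-in for a Gerrit change."""
    number: int
    in_config_repo: bool
    branch: str


def is_relevant(d, k, siblings, deployable_branches):
    """Decide whether dependency k of change d is relevant.

    k is assumed to belong to a MW code repo. siblings holds every change
    sharing the Change-Id named in d's Depends-On clause (k included).
    deployable_branches holds the MW code branches checked out on the
    deployment server.
    """
    sibling_branches = {s.branch for s in siblings}
    if not d.in_config_repo:
        # D and K both in MW code repos: keep tracking D's branch.
        if k.branch == d.branch:
            return True
        return k.branch == "master" and d.branch not in sibling_branches
    # D is in the MW configuration repo: the tracked branch is unknown,
    # so any deployable branch counts.
    if k.branch in deployable_branches:
        return True
    return k.branch == "master" and not (sibling_branches & set(deployable_branches))
```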

Backport vote

A relevant dependency votes for or against backporting its root change. If a single relevant dependency in the dependency trail votes against, then the root change cannot be backported.

Again, given a root change or relevant dependency Change targeting branch Br, its relevant dependency Dep will vote in the following ways.

Change and Dep belong to a MW code repo --> If Dep doesn't target master it will vote in favor if, and only if, it's been merged or passed as root change in the command line. If it targets master it will vote in favor if, and only if, it's been merged and its commit is in Br (as determined by the "change/_/in" endpoint from Gerrit)
Dep belongs to the MW configuration repo --> Dep will vote in favor if, and only if, it's been merged or passed as root change in the command line
Change belongs to the MW configuration repo and Dep belongs to a MW code repo --> If Dep doesn't target master it will vote in favor if, and only if, it's been merged or passed as root change in the command line. If it targets master it will vote in favor if, and only if, it's been merged and its commit is in all deployable branches (as determined by the "change/_/in" endpoint from Gerrit)
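
The three voting rules above could be sketched as a single function. This is a rough illustration under assumptions: `Change`, `votes_in_favor` and `can_backport` are hypothetical names, and `included_in` is assumed to already hold the branches reported by Gerrit's "included in" lookup for a merged commit. No Gerrit call is made here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical stand-in for a Gerrit change."""
    number: int
    in_config_repo: bool
    branch: str
    merged: bool
    included_in: frozenset = frozenset()  # branches containing the merged commit


def votes_in_favor(change, dep, root_numbers, deployable_branches):
    """Return True if relevant dependency dep votes for backporting."""
    merged_or_root = dep.merged or dep.number in root_numbers
    if dep.in_config_repo or dep.branch != "master":
        # Config repo deps and non-master MW code deps: merged or passed as root.
        return merged_or_root
    # dep is a MW code change targeting master: it must be merged and its
    # commit must already be in the required branch(es).
    if not dep.merged:
        return False
    required = deployable_branches if change.in_config_repo else {change.branch}
    return all(branch in dep.included_in for branch in required)


def can_backport(votes):
    """A single vote against blocks the backport of the root change."""
    return all(votes)
```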

Documentation

All of the above needs to be captured and properly explained in documentation living next to the backport code in the scap repository.

Open questions

How much do we want to change/expand our current battery of backport tests?
What's the best way to document the design in the code? Is a text README with ASCII graphs enough? Or are there any other interesting tools we can leverage here?

Related Objects

Status	Assigned	Task
Open	None	T362987 Proposal for refactoring of backport dependency handling
Open	jeena	T365146 Backport: implement new criteria for relevant dependencies
Open	jeena	T371611 Introduce new workflow for relevant dependencies calculation

Event Timeline

• jnuche created this task. Apr 19 2024, 3:26 PM

Restricted Application added a subscriber: Aklapper. Apr 19 2024, 3:26 PM

• jnuche updated the task description. Apr 19 2024, 3:37 PM

dancy updated the task description. Apr 23 2024, 3:01 PM

@jnuche Thank you for this beautifully written document!

Suggestions/comments:

Please define sibling dependency in the Terminology section.
Codewise, I want to have a single function that takes root change numbers as input and generates a machine-readable validation report. This can be used as the main entrypoint for testing, and will have future value for an alternate scap UI.
Testing: Must be tested using the regular test script. This will get tested in CI and will be a lot faster than test-scap-backport.
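
A rough sketch of what such a single entry point might look like, for discussion only. `ChangeReport`, `validate_backport` and the `check_change` hook are hypothetical names, not existing scap code. The hook stands in for whatever dependency validation is agreed on, and the returned dictionary contains only plain types so it can be serialized.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class ChangeReport:
    """Validation outcome for one root change (hypothetical structure)."""
    number: int
    can_backport: bool
    problems: list = field(default_factory=list)


def validate_backport(root_numbers, check_change):
    """Validate root changes and return a machine-readable report.

    check_change(number, root_numbers) is a hypothetical hook that returns
    a list of problem descriptions for one root change (empty if none).
    """
    roots = frozenset(root_numbers)
    reports = []
    for number in root_numbers:
        problems = list(check_change(number, roots))
        reports.append(ChangeReport(number, not problems, problems))
    return {
        "ok": all(r.can_backport for r in reports),
        "changes": [asdict(r) for r in reports],
    }
```

Because the hook is injected, unit tests could pass a fake `check_change` and assert on the returned dictionary without talking to Gerrit.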

I'm not clear on what the proposed code changes are. I think what is mentioned in the "relevant dependencies" and "backport vote" sections are already implemented in the code (or in the case of change_in, have a MR to do so), except for maybe reducing the number of confirmation prompts. Can you clarify which parts are missing or whether it is that you would like to change the structure of the dependency checking process?

• jnuche updated the task description. Apr 24 2024, 11:25 AM

@jnuche Thank you for this beautifully written document!

Thanks!

Please define sibling dependency in the Terminology section.

Done

Codewise, I want to have a single function that takes root change numbers as input and generates a machine-readable validation report. This can be used as the main entrypoint for testing, and will have future value for an alternate scap UI.

That makes sense to me

Testing: Must be tested using the regular test script. This will get tested in CI and will be a lot faster than test-scap-backport.

When you say test-scap-backport you mean the integration tests that currently run in train-dev, right? Having tests that specifically target the backport dependency validation and that are part of the regular tests/CI is also something that I would like to have, and I personally think it's worth investing the time in doing it.

I'm not clear on what the proposed code changes are. I think what is mentioned in the "relevant dependencies" and "backport vote" sections are already implemented in the code (or in the case of change_in, have a MR to do so), except for maybe reducing the number of confirmation prompts. Can you clarify which parts are missing or whether it is that you would like to change the structure of the dependency checking process?

As a first step I want to make sure that we all agree on terminology and a set of acceptance criteria (do all my points above look correct? did I miss something?)

Then we can look at the current state of the code (including the MR you mentioned) and discuss any required code changes. If we verify that we are already checking the agreed upon A/C (and only those A/C) then that aspect of the changes is complete and doesn't require extra work :)

The other piece of work regarding the code would be a refactor to make the code match the agreed upon terminology. Quite often when we talk about backport issues it's hard to tell what we are referring to and we have to start remembering how things work from scratch. I think that having a design document and code that matches it would really help us in the future to have a reference and a starting point for discussions.

In T362987#9739951, @jnuche wrote:

Testing: Must be tested using the regular test script. This will get tested in CI and will be a lot faster than test-scap-backport.

When you say test-scap-backport you mean the integration tests that currently run in train-dev, right?

Yes.

Having tests that specifically target the backport dependency validation and that are part of the regular tests/CI is also something that I would like to have, and I personally think it's worth investing the time in doing it.

Agreed!

• jnuche updated the task description. Apr 24 2024, 3:23 PM

In T362987#9740026, @jnuche wrote:

I'm not clear on what the proposed code changes are. I think what is mentioned in the "relevant dependencies" and "backport vote" sections are already implemented in the code (or in the case of change_in, have a MR to do so), except for maybe reducing the number of confirmation prompts. Can you clarify which parts are missing or whether it is that you would like to change the structure of the dependency checking process?

As a first step I want to make sure that we all agree on terminology and a set of acceptance criteria (do all my points above look correct? did I miss something?)

Then we can look at the current state of the code (including the MR you mentioned) and discuss any required code changes. If we verify that we are already checking the agreed upon A/C (and only those A/C) then that aspect of the changes is complete and doesn't require extra work :)

The other piece of work regarding the code would be a refactor to make the code match the agreed upon terminology. Quite often when we talk about backport issues it's hard to tell what we are referring to and we have to start remembering how things work from scratch. I think that having a design document and code that matches it would really help us in the future to have a reference and a starting point for discussions.

Thanks for the clarification! That sounds good to me.

• jnuche mentioned this in T365146: Backport: implement new criteria for relevant dependencies. May 16 2024, 1:27 PM

I'm wondering if we want to change the terminology of 'Production branch' to 'Active branch' to match the terminology that is all over scap (active wikiversions, etc.)?

jhuneidi updated https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/270

backport.py: Restore included_in branches check

• jnuche updated the task description. May 17 2024, 12:09 PM

I'm wondering if we want to change the terminology of 'Production branch' to 'Active branch' to match the terminology that is all over scap (active wikiversions, etc.)?

Personally I like that we can talk of deployable branch/change and production branch/change. It has a symmetry that helps me grasp intuitively the relationships between those things:

- Production branch: The master branch of the MW configuration repo or any MW version in the wikiversions.json file in the deployment server. Restricted to backport repos
- Production change: A change that targets a production branch
- Deployable branch: A branch currently checked out in the deployment server. Restricted to backport repos. Production branches are a subset
- Deployable change: A change that targets a deployable branch. Production changes are a subset

"Active branch" makes sense but "Active change" is not a term we normally use and it's a bit ambiguous. The other option is to use "Active branch" and "Production change", but that breaks the connection between those two terms. Soooo, dunno.

jhuneidi opened https://gitlab.wikimedia.org/repos/releng/train-dev/-/merge_requests/71

Allow traindev user to delete branches

dancy merged https://gitlab.wikimedia.org/repos/releng/train-dev/-/merge_requests/71

Allow traindev user to delete branches

jhuneidi merged https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/270

backport.py: Restore included_in branches check

jeena mentioned this in rMSCA509ed9abb949: backport.py: Restore included_in branches check. Jun 4 2024, 11:00 PM

D belongs to the MW configuration repo and K belongs to a MW code repo --> If K doesn't target master it will vote in favor if, and only if, it's been merged or passed as root change in the command line. If it targets master it will vote in favor if, and only if, it's been merged and its commit is in all deployable branches (as determined by the "change/_/in" endpoint from Gerrit)

Why should the commit be in ALL deployable branches? I think it only needs to be in one deployable branch. A backport targeting a newer version of wmf wouldn't necessarily exist in a previous version, right?

A backport targeting a newer version of wmf wouldn't necessarily exist in a previous version, right?

In a situation where the backport root is targeting a wmf version, then you're absolutely right. But in the scenario you're quoting D belongs to a config repo (so we know it's targeting master) and K is also targeting master (but it's a wmf code repo). At that point we don't know which of the wmf version branches are required to contain the commit in K.

For example, imagine that instead we decided that the requirement is the commit from K should be just in the production branches (not requiring other deployable branches to contain it). Then in some cases we would be reintroducing this bug: T317795. That is, we deploy D and since the required commit from K is in all production branches, everything seems to work well at first, but maybe the commit is missing from one of the other deployable branches. Then you try to roll forward/back to that deployable branch and things go kaboom.

A different situation would be if we had a dependency trail that looked something like:

Root change is D and belongs to a MW repo (targeting a wmf version, let's call it branch V123) --> K1 belongs to a config repo (hence targeting master) --> K2 belongs to a MW repo (targeting master)

In that case we could require that K2 is merged and its commit present in V123 (but not required in any other deployable branches). But to implement this we would need to remember V123 and pass it along the dependency trail, which is something the currently proposed solution doesn't do. Since I thought that a case like this is pretty unlikely, I decided against adding it to try to prevent the acceptance criteria and ~~the implementation~~ from getting even more complicated (actually forget about the implementation point, it would be trivial to get the branch targeted by a root change in a trail). But this is a matter of balance between simplicity of the solution, convenience for the users and the likelihood of errors (for example, if it turned out that the commit from K2 was required to be in the other branches, did the developer set up their dependencies wrong? or is it a bug that scap didn't detect the situation? it looks to me like it's the former in this case, but seems debatable)

Note however, that if the root change D belonged to a config repo, we are back to the first situation where we need to consider all deployable branches. So that's a scenario we need to handle anyway.

At that point we don't know which of the wmf version branches are required to contain the commit in K.

In the scenario where wmf.1 is deployed and K is only included in wmf.2 this requirement would cause the backport for D to fail. I don't think that is the correct behavior. The configuration change D could have no relevance to wmf.1 .

For example, imagine that instead we decided that the requirement is the commit from K should be just in the production branches (not requiring other deployable branches to contain it). Then in some cases we would be reintroducing this bug: T317795. That is, we deploy D and since the required commit from K is in all production branches, everything seems to work well at first, but maybe the commit is missing from one of the other deployable branches. Then you try to roll forward/back to that deployable branch and things go kaboom.

I didn't mention production branches, but I don't understand what you mean by this. It looks like the problem for T317795 is that scap backport refused to backport because the change wasn't in a deployable branch, when the desired outcome was for the backport to continue, meaning there should be no restriction on K being in all deployable branches.

Since it's possible that a change to configuration D doesn't require the depends-on change K to be in every deployed branch, I don't see how that should differ when the change K had originally been merged into master and then branched into a wmf branch.

The configuration change D could have no relevance to wmf.1

But then again, it could. That's the point, that we don't know because the developer didn't give us that information in the dependency chain that they set up. If D did indeed require the commit from K to be in wmf.1 and we allowed the backport to proceed, things would break.

Originally what I was thinking in that situation (when not all deployable branches have the commit from K) is to inform the operator of the situation and let them know they need to create changes for the branches that don't contain the commit. But thinking about it again that feels like a pretty bad user experience 😅 What we could do instead is to inform the operator of which deployable branches are missing the commit from K and give them the responsibility of deciding whether to go ahead with the backport (the operator is the one who in theory knows which branches would actually need the change)
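
To make that second option concrete, here is a small sketch of how the prompt text could be built. `missing_branches` and `confirmation_message` are hypothetical names, and `included_in` is assumed to be the list of branches Gerrit reports as containing the commit from K.

```python
def missing_branches(included_in, deployable_branches):
    """Return the deployable branches that do not contain the commit."""
    return sorted(set(deployable_branches) - set(included_in))


def confirmation_message(dep_number, included_in, deployable_branches):
    """Build the operator prompt, or return None if nothing is missing."""
    missing = missing_branches(included_in, deployable_branches)
    if not missing:
        return None
    return (
        f"Dependency {dep_number} is not included in deployable branch(es): "
        f"{', '.join(missing)}. Continue with backport?"
    )
```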

Before I went on vacation we talked about having a call to continue chatting about this. I'll try to set it up, I think we'll get better results analyzing this scenario together.

• jnuche updated the task description. Jul 16 2024, 2:16 PM

• jnuche updated the task description. Jul 16 2024, 2:19 PM

• jnuche updated the task description. Aug 1 2024, 1:00 PM

• jnuche mentioned this in T371611: Introduce new workflow for relevant dependencies calculation. Aug 1 2024, 2:02 PM