Process for dealing with cherry picks

ekohl · February 17, 2021, 11:23am

This RFC is less about the exact code and more about the process. It is about how we make sure PRs are cherry picked into stable branches. In recent times we’ve relied on @tbrisker to take care of this. This does not scale and we don’t want to depend on a single person for something quite crucial. Here is my initial proposal to make it a process.

The problem I’m trying to solve is that for every fix merged, we should consider if it needs to be a cherry pick. There are 2 ways to drive this.

Github

We can have a Github driven system that uses labels. We would apply Needs cherrypick on any PR that should be considered for a cherry pick. Then it should regularly be handled. That can be to either remove the label in case it shouldn’t be picked. If it should be picked, the PRs to branches should be opened and the label be removed.

There could be automation to apply this label automatically. For example, our bot can automatically add the label when a PR is merged and has the type Bug in Redmine. The reasoning would be that we don’t cherry pick features or refactors unless there’s a good reason. In that case it can still be added manually.

This does have a risk that we miss updates. For example, sometimes a Redmine issue has multiple commits. If one commit is missed, the fix may be incomplete. That brings me at the other point

Redmine

Katello has been using a Redmine based cherry pick flow. While I don’t know the exact details, there’s instructions.

I would propose to use the Redmine releases. To mark something for a cherry pick, the correct Fixed in version should be set. This allows the tools to correctly track it.

There should be a regular moment when the cherry pick PRs should be opened. It would then become a regular PR review, just to the stable branch instead of develop.

Here automation could run and create PRs. We can decide to bundle up all commits into one PR or one PR per Redmine issue. I’m leaning to the latter since it makes it easier to accept or reject each pick but it does increase the chance of conflicts. Conflicts is also something that should be dealt with.

Questions

Do people think this this makes sense?
Do we want one, the other, both or none?
What am I forgetting?

tbrisker · February 17, 2021, 12:22pm

Here, here! It’s quite hard to keep track of everything that goes in to Foreman when I’m online, not to mention when I’m away.
Right now in most cases I’m the only person merging (or cherry-picking directly) into stable branches, and there are 4 ways that I notice things that need cherry-picks:

Someone opens a cherry-pick PR to a stable branch.
Someone sets the “Target version” field in redmine (see below).
Someone mentions me in a comment on GH and asks for cherry-pick.
I notice during reviews a PR that I believe makes sense to cherry-pick.

for cases 2-4 in most cases I perform the cherry-pick directly to the stable branch instead of opening a PR, unless I feel there is a need to have CI tests running on it again in the stable branch.

We actually have a similar process (sometimes) used in foreman, only using the “Target version” field. In fact, the reason we added the “Fixed in” field was to allow us to separate between “What should go into release X” and “What is the code already in release X”.
Currently, prior to every release we (well, mostly I) run a query on redmine to look for issues that are marked with “Target version” of the current release but don’t have it in “Fixed in” - i.e. things that need cherry-picks (or potential release blockers in case they are still open).

The main issue with the Redmine approach is that someone (contributor, reviewer, some other developer) needs to notice a certain change is worth cherry-picking and manually change the field in redmine. Adding a label manually on GH still requires an action but it’s probably easier to notice since you’re already there when clicking the merge button.

The majority of issues don’t get cherry-picked, so I would rather not auto-add a label to many prs and then require manual action to remove it (or we would start ignoring it because of too many false-positives). Perhaps we could have a bot ask in a comment which version this should go into and update the target version according to another comment (or a field in the PR template)?

Another suggestion that was discussed at the core maintainers triage meeting a few weeks ago was to dedicate about 5 minutes of that meeting each week to quickly go over the PRs that were merged during the past week and see if there are any that should be cherry-picked.

One more option we could consider is doing something similar to how the linux kernel does stable branches - there is a separate maintainer who is responsible for it (Greg Kroah-Hartman in their case) and makes the call on what goes into -stable. We could have one or more of the maintainers whose responsibility will include monitoring the incoming changes and pulling what is needed back (and perhaps also decide when it makes sense to do a .z release). Katello does have a bit of a similar solution where each release has a different owner who continues to “own” the stable branch until it goes EOL. In the past we tried doing some rotation in Foreman as well, but the complexity of learning the full release process while doing it for the first time led to a lot of issues and delays (which was why I started managing the releases over two years ago, along with breaking up the process between two or more people). However, if we assign an owner who is only responsible for monitoring and performing cherry-picks to the stable branch, perhaps it could work.

I would definitely prefer one for each issue, otherwise if one issue breaks something it can be very difficult to identify which one it is and exclude it from the batch.

ekohl · February 17, 2021, 12:43pm

My idea was to combine both: the bot would have a mechanism to add a label. Then in the triage meeting we go over all items with the label.

What exactly the mechanism would be is up for debate. I worry that if you use comments to add a label that would greatly increase the amount of notifications. That too is a huge burden while removing a label is quite cheap.

There are other things to consider if you want to be smart. You already mentioned the Linux kernel. They use Fixes: COMMIT in the commit message. That signifies it is a candidate for a cherry pick. You also know when COMMIT was introduced and if it needs a cherry pick into the stable branches.

However, given how poor our current commit messages are (they are often really bad) I don’t expect people to take the effort. That would make it an unreliable mechanism.

Another thing that comes to mind is using the Affects version field in Redmine. It’s similar to Fixes, but less precise. It has the benefit of being able to modify it later on.

In short: how many false positives do we accept for a false negative.

My proposal to add a label to each merged PR that is attached to a Redmine bug will have a decent number of false positives, but removing a label is cheap. In the pull request overview you can bulk remove labels.

And I didn’t state this explicitly, but I strongly believe both Foreman and Katello should adopt the same process. It doesn’t need to be executed by the same people, but if we can share a process we can also share code. It’s also clearer for contributors to multiple projects how things work.

ehelms · February 17, 2021, 1:18pm

A nuance of the Github proposal I am missing is how you know where a fix should be cherry picked to? Given we have multiple active releases supported at a time. Whereas Redmine provides the ability to target a version for a fix. In the Katello process, the tool_belt tooling allows setting prior releases in the configuration. This means that if a cherry pick goes into 3.18.2 and then someone is working on 3.19.1 all cherry picks that went into 3.18.2 will be identified to ensure they also end up in 3.19.1 for continuity.

My first thought is always that if Redmine is our source of truth for issues, that it should be the driver for all the processes. If we want to use a Github based process, we should consider switching to Github issue management and unify on there.

ekohl · February 17, 2021, 1:40pm

I never understood the logic behind the prior release. My thinking is always “does it affect x.y”. So if the bug affects 2.3, it needs to be picked into 2.3-stable. Then when we decide to do a 2.3.z, it’ll be released. If it affects 2.3, then chances are it also affects 2.4.

Generally speaking I try to figure out what the root cause is. This typically allows you to find a specific commit that introduced it. It’s trivial to find out which release that is (git tag ---contains COMMIT). Usually I already do this while writing a patch but I see few others do this which means the burden lands on the stable release maintainer(s).

Once you know where it was introduced, you can decide the cherry pick strategy. If it only affects nightly, it’s trivial since no pick is needed. For existing releases, you can determine the impact. Upgrade blockers generally need a cherry pick. For other bugs it can be sufficient to fix it in one stable branch but not the other due to impact.

Here you do a risk analysis. If the process becomes openening PRs automatically via some logic, it becomes a regular review. During reviews you always look at the impact of a patch, the criteria just become a bit stricter. We can set permissions on stable branches that a stable release maintainer needs to approve it.

Marek_Hulan · February 24, 2021, 10:20am

Isn’t this resolved by maintainers going through all merges PRs on a weekly basis and checking which should go to the stable branch? Or are you rather trying to solve, how to record such decisions and the progress? If so, are you trying to address it from the maintainer perspective or contributor perspective? As a maintainer I can set labels as a contributor I can’t. In Redmine, more people can modify fields, but if I’m not mistaken, only those with a Developer role.

For the reply below, I’m going to assume we’re trying to record decisions from the maintainers call and drive the process.

I guess at this point, we also need to add another label saying PRs for that were opened.

Is it worth the effort? There’s few PRs like that a week. Especially if it wouldn’t be reliable and if it’s only for people who are clicking through github on the call, I think it’s fine for them to just add a label when needed.

Do we need that information in redmine before the patch is merged somewhere? I’d prefer to use github and only automate the fixed in version once the patch lands in a given branch. We already set the latest version after patch is merged to develop. If a github label would indicate the patch needs to get to stable branch, we merge it to e.g. 2.3-stable, we’d just added the 2.3.latest to the redmine issue after it gets to the stable branch. I’d prefer to use github for code related processes, redmine for reporting related processes (if we need to have itm that’s a separate topic though)

I think we need to go over all PRs that were merged. Some people don’t have a way to mark their PR for the stable branch (both github and redmine options)

Yes, no to comments

I think we’re already in the point redmine is not the only source of truth. Some projects use github issues only. I think it’s time for this debate, just not in this thread.

Dirk · February 24, 2021, 2:44pm

I think I only have normal user privileges, so I can not add a new value to the field, but can set it to all existing values. If this answers the question.

Marek_Hulan · February 24, 2021, 8:01pm

Interesting, thanks for checking! Looking at the project setting, you really don’t have the Developer role.

ekohl · May 27, 2021, 1:52pm

Just to close this out, I’ll summarize what is now happening in practice.

Every week for Foreman core there’s the maintainers meeting. They go over all merged PRs since the last meeting and decide for every PR if it needs the cherry pick.

It’s also possible to explicitly nominate a cherry pick by adding the Needs cherrypick. It’s not exactly formal, but it does help if you add which versions it should be picked to. During the same weekly maintainers meeting this gets reviewed and picked.

It should be noted that that only covers core. There’s also a weekly release team meeting which also makes sure that issues are picked. This is more Redmine driven.

If anyone not part of these teams wants to make sure a fix gets into a (supported) release, it can be added to the release team agenda:

If there’s additional concerns, we can always revise the plan but I think we have a decent enough process in place to make sure cherry picks happen and don’t rely on a single person. With that, I consider this RFC as resolved.

Thanks everyone involved!