Should we invalidate jenkins builds after a certain time period?

In Katello, we recently had our master branch failing in jenkins, due to failures in eslint and react snapshot tests.

This, of course, is inconvenient as it blocks all other PRs from passing jenkins and being merged. Theoretically, if these tests are being run on all PRs, the tests should never be broken on master.

The issue is we have PRs who have had tests ran a while ago and passed, but since then, master has changed quite a bit. The tests will fail once the PR is merged and rebased.

I have an example of this PR, which introduced some unused import eslint errors. From what I can tell on the PR, the tests ran on August 9th (and presumably passed), but it was merged yesterday. In this time period, master had changed quite a bit and once it was rebased into master, some tests were failing. (Iā€™m in no way pointing fingers at anyone involved in the PR, everyone used the correct workflow for our current process, it just happens to be a good example)

We had a similar breakage a few weeks ago, they seem to happen in our React code since its changing so fast (which is a good thing!)

This can be easily avoided if we invalidate jenkins builds or require them to be run again after a certain time period. This would ensure that the PR ran with the latest master branch and was merged shortly after. My preference would be a week, but that is something we can discuss and/or vote on if we decide to go this route.

I realize something could change within the allotted time period even if its small, but I think we can prevent the vast majority of these cases even with a generous time of keeping jenkins builds valid.

My initial questions are:

  • Is invalidating a jenkins build or similar something we would like to consider?
  • Do we think it would help keep master from breaking?
  • Is this possible technically within jenkins and/or github?

I can create polls and such, but would like to get some initial feedback first from those who are more familiar with our test automation.

Looking forward to hearing your opinions, thanks!

1 Like

Invalidating Jenkins builds after a week (or other time) seems like a good solution, though Iā€™m unsure how to implement that, whether GitHub has a mechanism or Jenkins or prprocessor :man_shrugging:

Another solution could be a process change on our part to request users to rebase their PRs if the build is ā€˜too oldā€™. Tooling to support this could perhaps be a comment added using prprocessor after a week of inactivity: ā€œPlease rebase this PRā€, again Iā€™m unsure prprocessor can do this, as it runs on each update to the PR.

Out of the box, I canā€™t think of a mechanism that supports this. There are things we could do with prprocessor for example to analyze open PRs and comment on them if htey are old. Possibly even have it comment a [test ] comment to force re-testing.

I would encourage maintainers to be mindful of activity on a PR when reviewing and if you feel itā€™s stale to add a [test ] comment to re-test things before merging.

1 Like

@ehelms good idea! A comment from the pr processor like ā€œIts been over one week since tests passed on this PR, please re-run tests before merging to avoid introducing failures.ā€ would be really helpful to remind the reviewer/author to do so. This would be a simple solution to this issue. Hopefully there is a way to say 'run on PRs that have had passing builds longer than a week".

+1 this seems to be the only thing we can push towards today without adding ā€˜PR Scanningā€™ ability in prprocessor (currently only scans a PR when the PR updates).

It would be nice to have a comment reminding us to re-run jenkins or invalidating jenkins builds after a certain time, but I also donā€™t see a way to do either easily as others have mentioned

Perhaps we can revisit this at another time, but for now its seems up to the maintainers to re-run jenkins on PRs with ā€œstaleā€ jenkins builds before merging.

Thanks for discussing all!

We have a scanner in our PR processor to close inactive:

You could apply something similar for something thatā€™s been open for > 1 week without activity and add a build failure. When you rebase all build statuses are cleared so that should be the preferred way to update it. Using [test ] is IMHO worse because other commits can invalidate your results with the resulting merged commit still breaking (like the lint rules).

Thanks for sharing that.

What we would want to check for is not ā€œno activity for a weekā€, but rather ā€œit has been a week since jenkins has been run and passingā€. Is there a way to check for that?