Should we invalidate Jenkins builds after a certain time period?

John_Mitsch · September 20, 2018, 3:10pm

In Katello, we recently had our master branch failing in Jenkins due to failures in ESLint and React snapshot tests.

This, of course, is inconvenient, as it blocks all other PRs from passing Jenkins and being merged. Theoretically, if these tests are run on all PRs, the tests should never be broken on master.

The issue is that we have PRs whose tests ran and passed a while ago, but master has changed quite a bit since then. The tests can then fail once the PR is merged onto the current master.

One example is this PR, which introduced some unused-import ESLint errors. From what I can tell on the PR, the tests ran on August 9th (and presumably passed), but it was merged yesterday. In that time, master had changed quite a bit, and once the PR was merged into master, some tests were failing. (I’m in no way pointing fingers at anyone involved in the PR; everyone used the correct workflow for our current process. It just happens to be a good example.)

We had a similar breakage a few weeks ago. These seem to happen in our React code since it’s changing so fast (which is a good thing!).

This can be easily avoided if we invalidate Jenkins builds or require them to be run again after a certain time period. This would ensure that the PR ran against the latest master branch and was merged shortly after. My preference would be a week, but that is something we can discuss and/or vote on if we decide to go this route.

I realize something could still change within the allotted time period, even if it’s small, but I think we can prevent the vast majority of these cases even with a generous window for keeping Jenkins builds valid.

My initial questions are:

Is invalidating a Jenkins build, or something similar, something we would like to consider?
Do we think it would help keep master from breaking?
Is this technically possible within Jenkins and/or GitHub?

I can create polls and such, but would like to get some initial feedback first from those who are more familiar with our test automation.

Looking forward to hearing your opinions, thanks!

akofink · September 20, 2018, 3:18pm

Invalidating Jenkins builds after a week (or some other period) seems like a good solution, though I’m unsure how to implement that, or whether GitHub, Jenkins, or prprocessor has a mechanism for it.

Another solution could be a process change on our part: ask users to rebase their PRs if the build is ‘too old’. Tooling to support this could perhaps be a comment added by prprocessor after a week of inactivity: “Please rebase this PR”. Again, I’m unsure whether prprocessor can do this, as it only runs on each update to the PR.

ehelms · September 20, 2018, 3:32pm

Out of the box, I can’t think of a mechanism that supports this. There are things we could do with prprocessor, for example analyzing open PRs and commenting on them if they are old. It could possibly even post a [test ] comment to force re-testing.

I would encourage maintainers to be mindful of activity on a PR when reviewing and, if it feels stale, to add a [test ] comment to re-test things before merging.

John_Mitsch · September 20, 2018, 4:06pm

@ehelms good idea! A comment from the PR processor like “It’s been over one week since tests passed on this PR, please re-run tests before merging to avoid introducing failures.” would be really helpful to remind the reviewer/author to do so. This would be a simple solution to this issue. Hopefully there is a way to say “run on PRs that have had passing builds for longer than a week”.

akofink · September 24, 2018, 1:39pm

+1, this seems to be the only thing we can push towards today without adding a ‘PR scanning’ ability to prprocessor (it currently only scans a PR when the PR updates).

John_Mitsch · September 24, 2018, 1:57pm

It would be nice to have a comment reminding us to re-run Jenkins, or to invalidate Jenkins builds after a certain time, but I also don’t see a way to do either easily, as others have mentioned.

Perhaps we can revisit this at another time, but for now it seems to be up to the maintainers to re-run Jenkins on PRs with “stale” Jenkins builds before merging.

Thanks for discussing, all!

ekohl · September 27, 2018, 1:38pm

We have a scanner in our PR processor to close inactive PRs:

github.com

theforeman/prprocessor/blob/master/scripts/close_inactive.rb

#!/usr/bin/env ruby
require 'raven'
require 'octokit'
require 'pp'
require 'date'
require 'yaml'
require File.expand_path(File.join('..', '..', 'repository'), __FILE__)
require File.expand_path(File.join('..', '..', 'github', 'pull_request'), __FILE__)
require File.expand_path(File.join('..', '..', 'redmine', 'issue'), __FILE__)

def close_prs(client, repo, config, label, time, message)
  query = "repo:#{repo} type:pr state:open label:\"#{label}\" updated:\"<#{time}\""
  result = client.search_issues(query, :per_page => CONFIG[:max_closed], :sort => 'updated_at', :order => 'asc')
  return if result[:total_count] == 0

  puts "Pull requests older than #{time}: #{result[:total_count]}"
  result[:items].each do |pr|
    title = pr[:title]
    number = pr[:number]
    user = pr[:user][:login]

This file has been truncated.

You could apply something similar to a PR that’s been open for > 1 week without activity and add a build failure. When you rebase, all build statuses are cleared, so that should be the preferred way to update it. Using [test ] is IMHO worse because other commits can invalidate your results, with the resulting merged commit still breaking (like the lint rules).
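
A minimal sketch of that idea, assuming an Octokit-style client like the one in close_inactive.rb. The method names `stale_pr_query` and `flag_stale_prs`, the `stale-build` status context, and the 7-day default are all hypothetical and not part of prprocessor; only `search_issues`, `pull_request`, and `create_status` are existing Octokit calls.

```ruby
require 'date'

# Hypothetical status context; any name that does not clash with
# the real Jenkins contexts would do.
STALE_CONTEXT = 'stale-build'

# Build a GitHub search query for open PRs with no updates in `days` days.
def stale_pr_query(repo, today, days = 7)
  cutoff = (today - days).strftime('%Y-%m-%d')
  "repo:#{repo} type:pr state:open updated:<#{cutoff}"
end

# Add a failing commit status to the head commit of every stale PR.
# `client` is assumed to respond to search_issues, pull_request and
# create_status the way Octokit::Client does.
# Returns the numbers of the PRs that were flagged.
def flag_stale_prs(client, repo, today: Date.today, days: 7)
  result = client.search_issues(stale_pr_query(repo, today, days))
  result[:items].map do |item|
    # Search results do not include the head SHA, so fetch the PR itself.
    pr = client.pull_request(repo, item[:number])
    client.create_status(repo, pr[:head][:sha], 'failure',
                         context: STALE_CONTEXT,
                         description: "No activity for #{days} days, please rebase")
    item[:number]
  end
end
```

Because the failing status is attached to the head commit, a rebase produces a new commit without it, which matches the point above that rebasing is the preferred way to refresh a PR.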

John_Mitsch · September 27, 2018, 2:00pm

Thanks for sharing that.

What we would want to check for is not “no activity for a week”, but rather “it has been a week since Jenkins last ran and passed”. Is there a way to check for that?
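
One possible way to express that check, sketched under assumptions: GitHub commit statuses carry a `state`, a `context`, and an `updated_at` timestamp, so a scanner could look at the newest status for the Jenkins context on a PR's head commit. The method name `stale_passing_build?`, the `'jenkins'` context string, and the one-week threshold are illustrative only; the real context names used by our Jenkins jobs would need to be substituted.

```ruby
WEEK_IN_SECONDS = 7 * 24 * 60 * 60

# statuses: array of hashes shaped like GitHub commit status entries,
# e.g. { context: 'jenkins', state: 'success', updated_at: Time }.
# With Octokit these could come from client.statuses(repo, sha).
#
# Returns true only when the newest status for `context` is a success
# that is older than `max_age` seconds.
def stale_passing_build?(statuses, context:, now: Time.now, max_age: WEEK_IN_SECONDS)
  latest = statuses
    .select { |s| s[:context] == context }
    .max_by { |s| s[:updated_at] }
  return false if latest.nil?                      # never tested, nothing to expire
  return false unless latest[:state] == 'success'  # failing or pending builds are not "stale passes"
  now - latest[:updated_at] > max_age
end
```

A PR for which this returns true is one where the reminder comment or a re-test would apply, regardless of how recently the PR itself saw other activity.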