Sending Jenkins build failures to Discourse

Let me elaborate on this: if I break core for everyone, that is a priority. Therefore I believe I need to have a message in my INBOX. That’s the first folder I read every morning. Things get done while I am sipping my first coffee.

In my workflow, I go through the important Red Hat mailing lists, then move on to Discourse, and then to GitHub. A notification there can wait for me until evening; it will likely end with someone pinging me before lunch. End of story :slight_smile:

I’m hearing a lot of “I don’t personally like this” (which is totally fine, and good to know), but I think we need to try more ideas and evolve how we do things. Unless you believe that this trial would actively decrease the amount of work on the Jenkins builds, I think we should try it. On a personal level, you can always mute the category/tag (or write a filter, if you use mailing list mode).

With that in mind, I did promise a poll, so here it is

  • Yes - let’s try it
  • Indifferent - I don’t personally like it
  • No - I think this will harm the builds

I voted No because for now I think we should first work on this with a smaller group to get stable nightly builds. There are still some things that I think are false positives or other things we should fix before exposing it to the entire community. We should make sure people can actually have an impact on fixing the problem because otherwise it’s spam to them and they’ll start to ignore it. By the time we make it usable, it’s already on the ignore list and we miss our goal.

Longer term I do think it can be a good idea.

I am neutral, slightly against this. Whatever we do, let’s make sure it won’t harm the browsing/reading experience, search (!) and also Google (I say Google, but I mean web search engine crawlers). We don’t want to be losing rank there. Perhaps disallow search engines on these topics, I don’t know.

I think @ekohl makes an excellent point - yes, but not right now. Given the mixed result of the poll, I’ll come back to this in a few months, after we’ve seen what impact the recent changes can have on the build failures :slight_smile:

Based on some discussions among the developers working on releases, I am reviving this idea and proposing it as an experiment with a fixed duration, a re-assessment at the end, and a limited set of jobs. The same proposed workflow and rules would apply for where failures are posted and how developers interact with the posts to communicate breakages and pending fixes. Builds are more stable than they were, and we believe this workflow will help keep them that way while showing a broader audience when there is a failure, what types of failures occur (in case we need to address them systematically), and how they were resolved.

  • Try it out for 2 months
  • Include only the following jobs:
    • foreman-nightly-release
    • foreman-plugins-release
    • katello-nightly-release

Other jobs may be proposed or brought on board if they are in a similar critical path as the ones listed above. I’m adding another poll to see what folks think since the last one.

  • Yes
  • No
  • Indifferent

I voted no; let me explain. I am against testing only Katello and Foreman; I want all (or the top 10) plugins to be in the pilot as well. Because today, every time I ask for a Jenkins configuration change, the answer is “we would need to do this for ALL plugins, it is not possible”. There are some technical limitations, and plugin jobs are not on par with core and Katello, which is not fair.

Here is my concern: if this turns out to be a good experience, I can imagine I will not be allowed to enable discovery because “that would create too much noise”. Let’s test this properly with everything, and let’s solve all the challenges during the pilot (like deleting old posts or something like that).

I’m not against the inclusion of plugins, assuming that plugin maintainers are up for watching the topic and responding. I would also like to point out that the current focus is on release jobs, with no bias towards core or any other project. At present, Foreman, plugins and Katello are the only projects that have release jobs (and are built nightly).

There are some plans in the works to add more projects that are built on a nightly basis. Further, if we can come up with a nightly release design for individual plugins, then we can begin to incorporate those into this.

I voted yes because we have a clear place to do root cause analysis.

This one is still very unstable. In about half of the runs something fails because repoclosure is run in parallel, which it doesn’t properly support (https://bugzilla.redhat.com/show_bug.cgi?id=1593331).

Oh I missed these are “-release” jobs, voted yes then.

Given the approval to try this out, I’d like to move forward with it. @Gwmngilfen, I am looking for some help here to set up the Discourse category properly and to figure out the best way to send this data to it. Which tactic should we take?

  1. Send email to discourse from Jenkins
  2. Build CI functionality to hit discourse API to make a post

I am assuming we will also need a user account for Jenkins to send as.

Not sure why Discourse would send email to Jenkins?

Other than that, yes, that’s about it. The list is:

  1. Create the category (I guess as a subcategory of Development?)
  2. Assign it an incoming address (ci@community.theforeman.org?)
  3. Create an account for Jenkins
  4. Figure out how to post via the API

The last one could be done in parallel with an existing account, I guess.
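For the API route, here is a minimal sketch of what the request might look like. It assumes a Discourse version that accepts API keys through the `Api-Key` and `Api-Username` headers and creates topics via `POST /posts.json`; the function name, base URL, key and category ID are all hypothetical placeholders, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_topic_request(base_url, api_key, api_username, title, raw, category_id):
    """Build (but do not send) a request that would create a new topic.

    Assumption: the Discourse instance accepts header-based API keys.
    All argument values are supplied by the caller; none come from the thread.
    """
    payload = {
        "title": title,           # topic title, e.g. the failed job and build number
        "raw": raw,               # post body in Markdown
        "category": category_id,  # numeric ID of the CI category (hypothetical)
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/posts.json",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Api-Key": api_key,
            "Api-Username": api_username,
        },
        method="POST",
    )
```

Sending it would be a matter of passing the result to `urllib.request.urlopen`, with the key stored as a Jenkins credential rather than in the job definition.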

I don’t think Discourse has OAuth2 tokens (could be wrong) so we’ll want to be careful with the Jenkins account password, of course.

Can you send an email that targets a category vs. hitting the API?

Yeah, stupid me - I was thinking that at step 2, and then got diverted to thinking about the API at step 4 :slight_smile:

Both are viable. The email approach is easier, but of course anyone could technically mail the inbound address for the category. That’s not been the case for the dev and support categories though.

I suggest we go that way and look at the API if email gets abused for some reason.
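To make the email route concrete, here is a rough sketch of the kind of message Jenkins (or a script it calls) could send to the category's inbound address. The function name, subject format and body wording are made up for illustration; the real job would most likely use Jenkins' own mail step instead of Python.

```python
from email.message import EmailMessage


def build_failure_email(job_name, build_number, build_url, sender, category_address):
    """Compose a failure notification addressed to a Discourse category.

    Assumption: Discourse turns an inbound mail into a new topic, using the
    subject as the title and the body as the first post.
    """
    msg = EmailMessage()
    msg["From"] = sender            # must match the Jenkins account's email
    msg["To"] = category_address    # the category's incoming address
    msg["Subject"] = f"{job_name} #{build_number} failed"
    msg.set_content(
        f"Jenkins job {job_name} failed.\n\n"
        f"Build: {build_url}\n"
    )
    return msg
```

Delivery would then be something like `smtplib.SMTP(host).send_message(msg)`, with `category_address` set to whatever incoming address the category ends up with (ci@community.theforeman.org was the one floated above).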

Can you handle the first three there? I am not seeing a way through my permission level to create new categories and accounts.

Yeah, that’s admin-only, I’ll sort it tomorrow.

Everything is set up to give this a test run. The last piece is the job configuration. I’ve opened a PR for the two jobs we said we would start with. Please indicate on the PR if there is any additional information that folks would like to see in a given email.

This has now been fully implemented, and tested. First result:

We will be using these posts to track the failures and their investigation. If you’d like to stay informed, or contribute to fixing issues with our nightlies, please keep an eye on this new category and the threads within it. To begin with, the Foreman and Katello nightly RPM pipelines will be sending failures. Consider them the trial runs to build out the workflow and work out any issues.

My initial thought is that we’re probably going to want tags in the topics, and possibly @group mentions in the body - as the more we add, the more there is to track. The latter will probably “just work” as Discourse parses the body, but setting tags may be tricky…
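As a sketch of the @group part only: if Discourse parses mentions in the body as expected, the job would just need to append them to the text it sends. The helper name and the group names below are hypothetical, and tags are left out since it is not clear yet how to set them.

```python
def format_failure_body(job_name, build_url, groups=()):
    """Return a post body that optionally cc's one or more Discourse groups.

    Assumption: a plain "@groupname" in the body is enough to trigger a
    group mention once Discourse has parsed the post.
    """
    lines = [f"Job `{job_name}` failed: {build_url}"]
    if groups:
        mentions = " ".join(f"@{group}" for group in groups)
        lines.append("")                 # blank line before the cc line
        lines.append(f"cc {mentions}")
    return "\n".join(lines)
```

For example, `format_failure_body("katello-nightly-release", url, ["packaging"])` would end the body with a `cc @packaging` line, where `packaging` stands in for whatever group actually exists.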

I’ve added an “Unsolved” button to the Infra category, so that you can easily filter to just open CI issues.

The CSS doesn’t seem to highlight the Unsolved button when you use it though, solutions welcome :slight_smile: