Community bug testing/triage day - proposed Mon 29th Feb

Hi all,

I'm starting to organise a bug squashing day for the near future, and
wanted to post my thoughts for some feedback before I take it to the users
list.

Some events have converged that make this really attractive:

  • 1.11 release candidates start soon, and as usual we have a list of bugs
    that need attention
  • Discussions at recent conferences seemed to point towards bug days as a
    good thing to be doing
  • Downstream: QE will be holding a bug day of their own and would be
    interested in joining forces.

Taking the last point first - QE have committed their people to Monday 29th
February as a bug day, and I think we can get enough people on board for
that date, so long as I do the publicity this week.

From an organising point of view, here are my thoughts:

  • Invite anyone who wants to triage or test bugs to be a part of this
  • Make sure there are plenty of the full-time devs available to help (see
    below)
  • Use a dedicated IRC channel (#theforeman-bugs) to keep spam away from #tf
  • Use an etherpad as the central doc, containing
    • List of bugs being worked on, and by who
    • Suggested bugs to look at for those who want some direction
    • Setup notes / access details for test systems (I'll set up 1-2 systems
      on public OS1)
    • Who to talk to for help (i.e. who's available to support people right
      now)
    • other stuff?
  • Possibly run some hangouts for those wanting to chat real-time, use
    screenshare to discuss bugs, or get help with something tricky

I have confirmation that all the Red Hat devs can spend a whole day on this
so long as it's on Redmine issues that have an associated downstream Bugzilla
(so, nothing new there really). This gives us a good chunk of experienced
devs in different timezones who can be the backbone of the bug day.

Everyone else can, of course, work on whatever they wish - but we should
maintain a list of useful bugs needing looking at in the etherpad. That
way, we have a fast response when someone asks "What should I work on?".
Dominic's list is clearly the starting point for this, as these things
should really be tackled soon.

I suggest we run the event for the whole day - that's starting in the
morning TLV-time and running through to the end of the Raleigh workday.
It's a long day, but it means people can take breaks, and this gives us the
most windows for community people to contribute at a time that suits them.

Done right, this gives us the chance to fix a ton of stuff for 1.11, verify
other stuff is definitely fixed (for a nice stable release) and spend time
hacking together (a virtual equivalent of the recent Foreman Construction
Day, if you will).

Thoughts? What did I miss?

--
Greg
IRC: gwmngilfen

> From an organising point of view, here's my thoughts:
>
> * Invite anyone who wants to triage or test bugs to be a part of this
> * Make sure there are plenty of the full-time devs available to help
> (see below)
> * Use a dedicated IRC channel (#theforeman-bugs) to keep spam away from #tf

Yes, thanks.

> * Use an etherpad as the central doc, containing
> * List of bugs being worked on, and by who
> * Suggested bugs to look at for those who want some direction
> * Setup notes / access details for test systems (I'll set up 1-2
> systems on public OS1)
> * Who to talk to for help (i.e who's available to support people right
> now)
> * other stuff?

Last time this happened (without any warning), it caused problems on our
CI infrastructure as PRs were being updated from reviews incredibly
frequently, even before the tests from the last run had finished. We
don't have capacity for bursts. You'll need to add a lot more capacity
to compensate (at least double it I'd say), or to disable or throttle PR
tests for the day.
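
One possible shape for the "throttle" option is to coalesce test requests, so that a PR updated several times in quick succession is only tested once, against its latest commit. The following is a minimal sketch of that idea only. The class and method names are hypothetical, and it does not describe how the project's CI or PR testing script actually behaves.

```python
class CoalescingQueue:
    """Hypothetical sketch: keep at most one pending test request per PR."""

    def __init__(self):
        # PR number -> latest commit SHA still waiting for a test run.
        # A dict keeps insertion order, so the first PR queued is served first.
        self._pending = {}

    def request(self, pr_number, sha):
        # A newer push overwrites any older request for the same PR that has
        # not started yet; the PR keeps its original place in the queue.
        self._pending[pr_number] = sha

    def next_job(self):
        # Return (pr_number, sha) for the oldest queued PR, or None if idle.
        if not self._pending:
            return None
        pr_number = next(iter(self._pending))
        return pr_number, self._pending.pop(pr_number)
```

With this approach a burst of pushes to the same PR costs one test run rather than one per push, at the price of intermediate commits never being tested.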

Personally I found it very disruptive to the open PR reviews against
core as we went from a well-maintained queue to being tens of PRs behind
within a few hours, so I'd prefer that people do rebases and updates to
already-open PRs in the weeks beforehand, spreading out the work.

On 18/02/16 11:44, Greg Sutcliffe wrote:


Dominic Cleal
dominic@cleal.org

Would it help to switch to manual testing with [test] during the day?

On Thu, Feb 18, 2016 at 12:12:21PM +0000, Dominic Cleal wrote:

Yes, that might work. I don't know that we have a switch for that in
the PR testing script we use, apart from changing the configured rules
about members of which GH organisation can trigger tests. It might need
a patch (c.
https://github.com/theforeman/test-pull-requests/blob/master/test_pull_requests#L542).
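
To make the "manual testing with [test]" idea concrete, here is a rough sketch of the decision such a switch would need to make. It is not taken from the test-pull-requests script. The `PullRequestEvent` type, the `manual_only` flag and the function name are all hypothetical, and the only detail borrowed from this thread is the `[test]` phrase and the notion that organisation membership gates who can trigger tests.

```python
from dataclasses import dataclass

TRIGGER_PHRASE = "[test]"  # the comment phrase mentioned in this thread


@dataclass
class PullRequestEvent:
    """Hypothetical, simplified view of a PR update (not the real script's model)."""
    kind: str                # "push" for new commits, "comment" for a new comment
    actor_is_trusted: bool   # assumed: actor belongs to an allowed GitHub organisation
    comment_body: str = ""


def should_run_tests(event: PullRequestEvent, manual_only: bool) -> bool:
    """Decide whether an event should start a CI run."""
    # Assumption: only trusted actors can ever trigger a run.
    if not event.actor_is_trusted:
        return False
    # An explicit request in a comment triggers a run in either mode.
    if event.kind == "comment":
        return TRIGGER_PHRASE in event.comment_body
    # New commits trigger a run only when manual-only mode is off.
    if event.kind == "push":
        return not manual_only
    return False
```

Setting `manual_only=True` for the bug day would mean pushes alone no longer start test runs, while a trusted contributor can still ask for one by commenting with `[test]`.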

On 18/02/16 12:42, Ewoud Kohl van Wijngaarden wrote:


Dominic Cleal
dominic@cleal.org

> >> Last time this happened (without any warning), it caused problems on our
> >> CI infrastructure as PRs were being updated from reviews incredibly
> >> frequently, even before the tests from the last run had finished. We
> >> don't have capacity for bursts. You'll need to add a lot more capacity
> >> to compensate (at least double it I'd say), or to disable or throttle PR
> >> tests for the day.
> >>
> > Would it help to switch to manual testing with [test] during the day?
>
> Yes, that might work. I don't know that we have a switch for that in
> the PR testing script we use, apart from changing the configured rules
> about members of which GH organisation can trigger tests. It might need
> a patch (c.
>
> https://github.com/theforeman/test-pull-requests/blob/master/test_pull_requests#L542
> ).
>

I'll try to take a look at that, but it sounds like we can have a fallback
position of disabling the tests for a few hours - obviously that would mean
we can't merge, but we can get any patches well reviewed and ready to go if
tests pass. Either way, this isn't a blocker for doing a bug day, right?

>> Personally I found it very disruptive to the open PR reviews against
>> core as we went from a well-maintained queue to being tens of PRs behind
>> within a few hours, so I'd prefer that people do rebases and updates to
>> already-open PRs in the weeks beforehand, spreading out the work.

I hear the problem, and it's a good point. I'm not quite following the
proposal though - are you suggesting that we try to finish up the open PRs
before the bug day? Or should we avoid in-progress PRs as part of this
event? If an existing PR's author and reviewer get together during the day
and work through the issues, that's a positive thing, surely?

Cheers,
Greg

On 18 February 2016 at 13:45, Dominic Cleal wrote:
> On 18/02/16 12:42, Ewoud Kohl van Wijngaarden wrote:
> > On Thu, Feb 18, 2016 at 12:12:21PM +0000, Dominic Cleal wrote:

>
> >> Last time this happened (without any warning), it caused problems on our
> >> CI infrastructure as PRs were being updated from reviews incredibly
> >> frequently, even before the tests from the last run had finished. We
> >> don't have capacity for bursts. You'll need to add a lot more capacity
> >> to compensate (at least double it I'd say), or to disable or throttle PR
> >> tests for the day.
> >>
> > Would it help to switch to manual testing with [test] during the day?
>
> Yes, that might work. I don't know that we have a switch for that in
> the PR testing script we use, apart from changing the configured rules
> about members of which GH organisation can trigger tests. It might need
> a patch (c.
> https://github.com/theforeman/test-pull-requests/blob/master/test_pull_requests#L542).
>
>
> I'll try to take a look at that, but it sounds like we can have a
> fallback position of disabling the tests for a few hours - obviously
> that would mean we can't merge, but we can get any patches well reviewed
> and ready to go if tests pass. Either way, this isn't a blocker for
> doing a bug day, right?

I'm not saying don't do it, just that one of those options needs to be
done before starting.

>>> Personally I found it very disruptive to the open PR reviews against
>>> core as we went from a well-maintained queue to being tens of PRs behind
>>> within a few hours, so I'd prefer that people do rebases and updates to
>>> already-open PRs in the weeks beforehand, spreading out the work.
>
> I hear the problem, and it's a good point. I'm not quite following the
> proposal though - are you suggesting that we try to finish up the open
> PRs before the bug day?

Yes, partly - if contributors have PRs just waiting for small updates or
rebases, I'd ask them to try and get to it earlier and not wait until
this one day to do it.

On 18/02/16 17:40, Greg Sutcliffe wrote:
> On 18 February 2016 at 13:45, Dominic Cleal wrote:
> On 18/02/16 12:42, Ewoud Kohl van Wijngaarden wrote:
> > On Thu, Feb 18, 2016 at 12:12:21PM +0000, Dominic Cleal wrote:


Dominic Cleal
dominic@cleal.org

> > I'll try to take a look at that, but it sounds like we can have a
> > fallback position of disabling the tests for a few hours - obviously
> > that would mean we can't merge, but we can get any patches well reviewed
> > and ready to go if tests pass. Either way, this isn't a blocker for
> > doing a bug day, right?
>
> I'm not saying don't do it, just that one of those options needs to be
> done before starting.
>

I was just checking that the fallback position was acceptable. I'll take a
look at the PR processor.

> >>> Personally I found it very disruptive to the open PR reviews against
> >>> core as we went from a well-maintained queue to being tens of PRs behind
> >>> within a few hours, so I'd prefer that people do rebases and updates to
> >>> already-open PRs in the weeks beforehand, spreading out the work.
> >
> > I hear the problem, and it's a good point. I'm not quite following the
> > proposal though - are you suggesting that we try to finish up the open
> > PRs before the bug day?
>
> Yes, partly - if contributors have PRs just waiting for small updates or
> rebases, I'd ask them to try and get to it earlier and not wait until
> this one day to do it.

Right, that makes sense - it's wasted time on the bug day to tackle things that
are effectively already solved anyway. I'll add it to my notes.

I'm not hearing any other blockers, so I'll draft up something for -users
later today. Thanks for the input, guys.

On 19 February 2016 at 08:07, Dominic Cleal wrote: