Soliciting feedback about how the Foreman dashboard works

Currently there are two main sections where stats are reported: the
text table and the pie chart. I recently discovered that some of the
drill-down searches from the text table weren't matching the stats
reported on the dashboard. In addition, some of the stats were
reported incorrectly in certain edge cases, where failed runs had
successfully applied some resources. I figured out why and have since
fixed both issues.

One additional behavior still remains for this edge case, and I would
like to solicit some feedback on it.

In the pie chart, active nodes will include nodes that have failures
but also made changes in the last run. The edge case where this
becomes apparent is when you have a manifest that repeatedly applies
the same resource on each run.

So in the pie chart the various "slices" are now:

  • No report: All hosts that have no reports in the system.
    (Perhaps because they aren't running puppet).
  • Out of sync: No reports in the current puppet interval, e.g. 30 mins.
  • Pending changes: Only reported for hosts that are in no-op mode but
    have a change they would apply if they weren't running in no-op mode.
  • Notification disabled: Hosts that have been flagged to be excluded
    from foreman reporting.
  • OK: Has checked in and no changes were required.
  • Error: Puppet reported an error on the most recent run.
  • Active: Changes were made in the last run. The caveat is that errored
    hosts that meet this criterion also fall into this slice.

So I guess the discrepancy is for those hosts that meet multiple
conditions and show up in more than one slice. A host in the error
state can show up in both the Active and Notification disabled
slices. The question is, does this seem ok? I ask because my instinct
was that a pie chart is a distribution chart, and that each host
should only show up in one slice at a time. It took me a bit to figure
out that this wasn't the case.
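
To make the overlap concrete, here is a rough sketch (not Foreman's actual code; the host fields and sample data are made up for illustration). When every slice is an independent condition, the slice counts can add up to more than the number of hosts:

```python
# Hypothetical host records; the field names are illustrative, not Foreman's schema.
hosts = [
    {"name": "web1", "error": True,  "changed": True,  "notifications_disabled": False},
    {"name": "web2", "error": False, "changed": True,  "notifications_disabled": False},
    {"name": "db1",  "error": True,  "changed": False, "notifications_disabled": True},
    {"name": "db2",  "error": False, "changed": False, "notifications_disabled": False},
]

# Each slice is an independent predicate, so nothing stops a host matching several.
slices = {
    "Notification disabled": lambda h: h["notifications_disabled"],
    "Error": lambda h: h["error"],
    "Active": lambda h: h["changed"],
    "OK": lambda h: not (h["error"] or h["changed"] or h["notifications_disabled"]),
}

# Count the hosts matching each slice, then add the slices together.
counts = {label: sum(1 for h in hosts if match(h)) for label, match in slices.items()}
total_in_slices = sum(counts.values())

print(counts)
print(f"{total_in_slices} slice memberships for {len(hosts)} hosts")
```

In this made-up data, web1 is counted under both Error and Active, and db1 under both Error and Notification disabled, so the pie no longer represents a distribution over hosts.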

There would be a couple of ways to address this if we so chose.

  1. Give statuses an order of precedence, e.g. Error trumps Active, and
    No notification trumps everything.
  2. Subdivide the subsets so there are more slices, such that each
    combination of conditions has its own slice.

On the flip side I don't think we want to make things too confusing,
or break users' expectations.

What do you guys think?

Thanks,
Brian

> There would be a couple ways to address this if we so chose.
> 1) Give statuses an order of precedence. e.g. Error trumps active, and
> No notification trumps everything.
> 2) subdivide the subsets so there are more slices, such that each
> combination of conditions has its own slice.

Interesting - I find myself divided on this. My gut feeling is that a
host being a member of two groups is ok for the text output, but not
for the piechart (as you say, it should add up to 100%).

However, that's going to lead to more inconsistent behaviour, which is
bad. I'm kind of used to the idea that my active hosts include some
Error hosts, but perhaps my aversion to changing that is simply because
of what I'm used to.

I think too many slices looks bad - many will be zero, or very small
and hard to see. Thus I'd go for option one - I'll get used to it :)

> On the flip side I don't think we want to make things too confusing,
> or break users' expectations.

Expectations can be managed. If we're clear that this is so that the
presented data is more consistent, I think most will be in favour.


Greg Sutcliffe (gwmngilfen)


OpenPGP -> KeyID: CA0AEB93

··· On Thu 03 May 2012 01:05:15 BST, Brian Gupta wrote:

Based on your feedback, this is what I am thinking:

  1. If a host has notifications disabled, it will only appear in the
    "Notifications disabled" slice.
  2. If a host has an Error and doesn't have notifications disabled, it
    will appear in the "Error" slice.
  3. "Active hosts" slice will only show hosts that made a change on the
    last run and don't have notifications disabled or any errors.

Of course I will need to dig into the JavaScript stuff to make the search
behavior match.

Do bear in mind this will make the pie chart consistent with the host
state icon order of precedence, which I feel is a good thing.

Also bear in mind that I think the code to implement this will
complicate the dashboard controller a bit.

··· On Thu, May 3, 2012 at 6:49 AM, Greg Sutcliffe wrote:

That all sounds good to me. Consistency of data is important. Where
do "Pending" (i.e. noop) statuses fit into this?

Greg



··· On Tue 08 May 2012 09:41:50 BST, Brian Gupta wrote:

Hmmm… yeah I forgot about that. What statuses can a noop run have?
I'm guessing we need to figure out under what conditions a host has a
"pending icon" next to it in the list, and if the pending icon ever
gets overridden by other states.

··· On Wed, May 9, 2012 at 8:36 AM, Greg Sutcliffe wrote:

Thinking about it, it's actually obvious. By definition, a host should
be Active and Pending at the same time, so those two states can
co-exist at the same point in the hierarchy.

Greg



··· On Wed 09 May 2012 22:24:26 BST, Brian Gupta wrote:

> "a host should be Active and Pending"

Ugh, need to proofread more. That should read

"a host should _not_ be Active and Pending"

I need coffee…

Right, and if I understand correctly a noop run can't generate an
Error state either, since Errors are generated by puppet trying to do
something and failing… What about Notifications disabled? In my mind
Notifications disabled trumps all.

-Brian

··· On Thu, May 10, 2012 at 5:14 AM, Greg Sutcliffe wrote:

I've seen no-op generate errors, although I'm having trouble coming up
with an example. I've seen it as a first-run error with R.I.Pienaar's
'concat' module, which relies on a script to be exec'ed across a set
of directories, which it can't do on first run/noop since the script
hasn't been created.

It's an edge case but possible, so I think we retain what we said earlier.

I'm unclear on where Out of Sync sits - I guess that trumps all but
Disabled, since it might be masking Errors/Actions/Pending-changes
that will happen when it stops being out of sync? I guess that
leaves us with, in order:

Notifications Disabled
Out Of Sync
Error
Active / Pending (it shouldn't be possible to be in both states)
OK
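
The order above could be sketched roughly as follows. This is an illustration only, not Foreman's dashboard code; the host fields (`notifications_disabled`, `out_of_sync`, `error`, `changed`, `pending`) are hypothetical names. The "No report" slice is left out because it has not been placed in the order in this thread.

```python
from collections import Counter

# Highest precedence first. Each entry pairs a slice label with a
# hypothetical boolean field on the host record.
PRECEDENCE = [
    ("Notifications Disabled", "notifications_disabled"),
    ("Out Of Sync", "out_of_sync"),
    ("Error", "error"),
    ("Active", "changed"),
    ("Pending", "pending"),  # same level as Active; the two should not co-occur
]


def classify(host):
    """Return the single slice for a host: the first matching status wins."""
    for label, flag in PRECEDENCE:
        if host.get(flag):
            return label
    return "OK"


# Made-up sample hosts for illustration.
hosts = [
    {"name": "a", "error": True, "changed": True},
    {"name": "b", "notifications_disabled": True, "error": True},
    {"name": "c", "out_of_sync": True, "pending": True},
    {"name": "d", "changed": True},
    {"name": "e"},
]

slice_counts = Counter(classify(h) for h in hosts)
print(slice_counts)
```

Because every host lands in exactly one slice, the counts always add up to the number of hosts, which is what a pie chart needs.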

Greg



··· On 10/05/12 17:18, Brian Gupta wrote:

Ok, we are on the same page. I should have time this weekend to try
and turn this into code.

-Brian

··· On Thu, May 10, 2012 at 12:28 PM, Greg Sutcliffe wrote:

This happens when it tries to access a file with source =>
'file:///foo/bar', and there is supposed to be an exec that creates
that path first, but in noop the exec never runs, of course. In the
concat case the exec is the script that concatenates all fragments into
a local file, which the file resource has as its path.

··· On 10 May 2012 18:28, Greg Sutcliffe wrote:


Erik Dalén
Service Reliability Engineer

Ok, I discovered one issue. As the code is currently written, the searches
in the table at top have to match the searches for the pie chart slices.
I don't 100% understand the code that produces this behavior, but I am
certain this is the behavior.

Refactoring the code to make the data for the searches at top different
from what's in the piechart is currently beyond my skillset, so unless
someone can help, all I can do is make the changes we discussed, and then
refactor the table to match the behavior in the piechart.
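
One way to picture the table and the pie chart sharing the same data is a single per-status count that both views read from. The sketch below is only an assumption about the shape of such a design, not how the Foreman dashboard controller is written; `dashboard_stats`, `table_rows`, `pie_slices` and `status_of` are all hypothetical names.

```python
from collections import Counter


def dashboard_stats(hosts, classify):
    """Count hosts per status once; both views read this same result."""
    return Counter(classify(h) for h in hosts)


def table_rows(stats):
    """Rows for the text table: (status, host count), sorted by status."""
    return sorted(stats.items())


def pie_slices(stats):
    """Fractions for the pie chart, derived from the very same counts."""
    total = sum(stats.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in stats.items()}


# Stand-in classifier and sample data, for illustration only.
def status_of(host):
    return "Error" if host["error"] else "OK"


hosts = [{"error": True}, {"error": False}, {"error": False}]
stats = dashboard_stats(hosts, status_of)

print(table_rows(stats))
print(pie_slices(stats))
```

With this shape the two views cannot drift apart, since neither one computes its own numbers.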

Does this seem sane?

Thanks,
Brian

··· On Thu, May 10, 2012 at 12:37 PM, Brian Gupta wrote:

http://aws.amazon.com/solutions/solution-providers/brandorr/

In case it isn't clear what I am looking to do, here is a diff:

··· On Sat, May 12, 2012 at 4:41 PM, Brian Gupta wrote:

> Ok, I discovered one issue. As the code is currently written… The
> searches in the table at top have to match the searches for the pie chart
> slices. I'm not 100% understanding the code to get this behavior, but I am
> certain this is the behavior.
>
> Refactoring the code to make the data for the searches at top different
> than what's in the piechart, is currently beyond my skillset, so unless
> someone can help, all I can do is make the changes we discussed, and then
> refactor the table to match the behavior in the piechart.
>
> Does this seem sane?

I think consistency between the two is a good thing anyway, so it's fine
with me.

Greg

··· On Sat, May 12, 2012 at 4:41 PM, Brian Gupta wrote: