Finding hosts in error state

I'm not sure I agree with how foreman finds hosts in an error state,
but it could be I'm misunderstanding the definition of some variables.

It seems to find hosts that failed to apply a resource, failed to
restart a service, or had a resource skipped. It's the skipped
resources that I'm curious about: why are they counted for error'd
systems? The problem is that I have a resource that is scheduled. If
it's not within the scheduled time, it's skipped. But then because of this
foreman is classifying that system as in an error state.
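
A minimal sketch of the classification described above, in plain Ruby rather than Foreman's actual model code. The counter names failed, failed_restarts and skipped come from the search query quoted later in this thread; the HostStatus struct and both predicate methods are hypothetical and exist only for illustration.

```ruby
# Hypothetical stand-in for the per-host report counters; not Foreman's real model.
HostStatus = Struct.new(:failed, :failed_restarts, :skipped)

# Behaviour as described in this thread: any skipped resource marks the host as errored.
def error_including_skipped?(status)
  status.failed > 0 || status.failed_restarts > 0 || status.skipped > 0
end

# Alternative under discussion: only real failures count.
def error_excluding_skipped?(status)
  status.failed > 0 || status.failed_restarts > 0
end

# A host whose only unapplied resource was skipped (for example, outside its schedule).
scheduled_only = HostStatus.new(0, 0, 1)

puts error_including_skipped?(scheduled_only) # true
puts error_excluding_skipped?(scheduled_only) # false
```

With the first predicate a host that merely skipped a scheduled resource is reported as errored; with the second it is not, while hosts with real failures are still caught.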

Please let me know if I'm not 100% right on how foreman is checking
for error'd systems. If my understanding is correct, then should it
show up as an error in the scenario I described above?

Currently I'm modifying some of the files to change the behavior. A
little editing in app/controllers/hosts_controller.rb and
app/models/host.rb gets it to behave how I want (maybe not how you guys
want, because perhaps I'm misunderstanding something or my situation
is not the norm).

I did notice a bug in app/controllers/hosts_controller.rb while
fooling around with it. When you are on the dashboard page and click
on the number next to "hosts in error state", it brings you to the
hosts page with a query that was run to show error systems. I'm not
exactly sure what query is used to bring the hosts up initially, but the
query in the search box is not what's being used. If I click the
search button my results disappear. This is because in
app/controllers/hosts_controller.rb we have 'last report > 65 AND (…)':

def errors
  params[:search] = "last_report > \"#{SETTINGS[:puppet_interval] + 5} minutes ago\" " \
                    "and (status.failed > 0 or status.failed_restarts > 0 or status.skipped > 0)"
  show_hosts Host.recent.with_error, "Hosts with errors"
end

and I think what is wanted is 'last report > 65 OR (…)':

def errors
  params[:search] = "last_report > \"#{SETTINGS[:puppet_interval] + 5} minutes ago\" " \
                    "or (status.failed > 0 or status.failed_restarts > 0 or status.skipped > 0)"
  show_hosts Host.recent.with_error, "Hosts with errors"
end

After making that change I can bring up the page from the dashboard, then
hit the 'search' button with the pre-filled query, and it brings up the
same results.

So please give me your thoughts, and if you want a bug(s) created let
me know also.

Thanks,
Jake

I may be wrong about the last_report AND/OR bit … It seems to be
working as expected now … so I guess I misunderstand that bit. I'm
still playing around with it to see if I can reproduce my previous
issue or not.

Comments on skipped resources would still be appreciated though.

Thanks,
Jake

··· On May 20, 1:08 pm, Jake - USPS wrote:

> I'm not sure I agree with how foreman finds hosts in an error state,
> but it could be I'm misunderstanding the definition of some variables.
>
> It seems to find hosts that failed to apply a resource, failed to
> restart a service, or had a resource skipped. It's the skipped
> resources that I'm curious about: why are they counted for error'd
> systems? The problem is that I have a resource that is scheduled. If
> it's not within the scheduled time, it's skipped. But then because of this
> foreman is classifying that system as in an error state.
>
You are probably right… it seems that skipped (in a failure context) would
always be triggered by another failure (such as a failure to install a
package), in which case the host would be in failed mode anyway.

Unless anyone else disagrees, we will change the existing behavior.

>
> I did notice a bug in app/controllers/hosts_controller.rb while
> fooling around with it. When you are on the dashboard page and click
> on the number next to "hosts in error state", it brings you to the
> hosts page with a query that was run to show error systems. I'm not
> exactly sure what query is used to bring the hosts up initially, but the
> query in the search box is not what's being used. If I click the
> search button my results disappear. This is because in
> app/controllers/hosts_controller.rb we have 'last report > 65 AND (…)':
>

Yes, this is Bug #900 (searching on time is not correct), and a fix has
already been created. It will probably be merged tomorrow; there is a minor
discussion going on about whether we should use the browser's timezone or
the server's…

Thanks,
Ohad

··· On Fri, May 20, 2011 at 9:08 PM, Jake - USPS wrote:

So more testing, and something is different between the query filled
into the search box and what is actually used to initially populate
the page. Right now, looking at "hosts that had performed
modifications", I got 3 systems initially. If I hit search with

  last_report > "65 minutes ago" and (status.applied > 0 or status.restarted > 0)

I get 1 system. If I change it to

  last_report > "65 minutes ago" or (status.applied > 0 or status.restarted > 0)

and search again I get 4 systems.
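
To illustrate why the two forms of the query can return different counts, here is a small sketch in plain Ruby. The host data is made up, and the recent flag is only a stand-in for whatever the last_report term evaluates to on a given host. It does not reproduce Foreman's search parser or the counts reported above.

```ruby
# Made-up hosts: `recent` stands in for the time term of the query,
# `applied` and `restarted` for the status counters.
SampleHost = Struct.new(:name, :recent, :applied, :restarted)

hosts = [
  SampleHost.new("a", true,  2, 0), # time term true, status term true
  SampleHost.new("b", true,  0, 0), # time term true only
  SampleHost.new("c", false, 1, 0), # status term true only
  SampleHost.new("d", false, 0, 0)  # neither term true
]

# The parenthesised part of the query: applied > 0 or restarted > 0.
modified = ->(h) { h.applied > 0 || h.restarted > 0 }

with_and = hosts.select { |h| h.recent && modified.call(h) }.map(&:name)
with_or  = hosts.select { |h| h.recent || modified.call(h) }.map(&:name)

puts with_and.inspect # ["a"]
puts with_or.inspect  # ["a", "b", "c"]
```

The "and" form can never match more hosts than the "or" form. An initial listing whose count falls between the two suggests it is produced by a different condition than the one shown in the search box.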

Regards,
Jake

··· On May 20, 1:21 pm, Jake - USPS wrote:

Hi,

That sounds in fact like a good reason to change the behavior.
@Jake, ticket?

Cheers,
Marcello

··· -----Original Message----- From: foreman-users@googlegroups.com On Behalf Of Ohad Levy Sent: Saturday, May 21, 2011 19:30 To: foreman-users@googlegroups.com Subject: Re: [foreman-users] Finding hosts in error state

Well, I would officially wait for Paul or Ohad to reply, but this sounds
like a bug. It would be nice to know if a module is scheduled rather than
skipped. However, this could be a Puppet issue rather than a Foreman one.
I would jump on the puppet-users group and see whether Puppet does in fact
report when modules are marked as scheduled. If Puppet is passing info to
Foreman that a module is scheduled, it may just be that Foreman needs
another column called "Scheduled".

Corey

··· On May 20, 2011, at 11:35 AM, Jake - USPS wrote:

You received this message because you are subscribed to the Google Groups “Foreman users” group.
To post to this group, send email to foreman-users@googlegroups.com.
To unsubscribe from this group, send email to foreman-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/foreman-users?hl=en.

http://theforeman.org/issues/930

Thanks,
Jake

··· On May 21, 6:15 pm, "Marcello de Sousa" wrote:

I'm looking at the report for the node and it doesn't seem to show
that it was scheduled for the resource that is being skipped:

"Exec[install_opsware]": !ruby/object:Puppet::Resource::Status
  change_count: 0
  changed: false
  events: []
  failed: false
  file: /etc/puppet/development/modules/common/manifests/opsware.pp
  line: 20
  out_of_sync: false
  out_of_sync_count: 0
  resource: "Exec[install_opsware]"
  resource_type: Exec
  skipped: true
  tags:
    - exec
    - install_opsware
    - class
    - common::opsware
    - common
    - opsware
    - node
    - default
  time: 2011-05-20 14:38:46.842404 -05:00
  title: install_opsware

So it looks like you are right in that it would first need to be
reported by puppet before foreman could even know if it's a 'good' or
'bad' skip.
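
As a rough sketch of the point above: given only the fields visible in this report entry, a consumer can count skipped resources but cannot tell why each one was skipped. The hash below copies a few fields from the report excerpt. The summarize helper is hypothetical and is not how Foreman imports reports.

```ruby
# A few fields copied from the report excerpt above, as a plain Ruby hash.
resource_statuses = {
  "Exec[install_opsware]" => {
    "failed"  => false,
    "skipped" => true,
    "changed" => false,
    "events"  => []
  }
}

# Hypothetical helper: tally failed and skipped resources.
def summarize(statuses)
  {
    failed:  statuses.count { |_name, s| s["failed"] },
    skipped: statuses.count { |_name, s| s["skipped"] }
  }
end

summary = summarize(resource_statuses)
puts summary.inspect

# The entry carries no field explaining the skip, so a schedule-based skip
# looks the same as any other skip.
puts resource_statuses["Exec[install_opsware]"].key?("schedule") # false
```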

I'll look into this more when I'm in on Monday.

Thanks,
Jake

··· On May 20, 2:43 pm, Corey Osman wrote:

Check the puppet-users group, as I just asked this question there.

Corey

··· On May 20, 2011, at 1:10 PM, Jake - USPS wrote:
