Outofsync_interval vs #{origin.downcase}_interval

Hi everyone,
I was recently working on the outofsync_interval and I found it non-intuitive that the expected_report_interval is the sum of reported_origin_interval and default_report_interval (see here) which resolves to outofsync_interval + #{origin.downcase}_interval. I would intuitively expect that #{origin.downcase}_interval would overwrite the outofsync_interval. Is there a reason why it was solved this way?

This scope that is used in the dashboard suggests that the outofsync_interval should actually be overwritten.
(The line might also need some adjustment as it is now possible to add the outofsync_interval as host parameter) .

Can someone please clarify :slight_smile: Thanks in advance!

I’ll try to illustrate that on Puppet example. Puppet usually wakes up every 30 minutes. Then it peforms the run which takes e.g. 5 minutes. After those 5 minutes, the report comes and host status is updated. If the out of sync interval is set to 30 minutes and the report would come in 35 minutes, it would be considered out of sync. Therefore we have two intervals - 1) is the expected frequency of the regular run (reported_origin_interval) 2) the max lenght of configuration run to accept (default_report_interval).

I think when this was introduced for all cfgmgmt subsystems, the intention was to allow every subsystem to define it’s own single interval, so you’d only use reported_origin_interval, e.g. setting “Puppet interval” to 35. However there was a concern raised about backwards compatibility, so we ended up still adding default_report_interval so people could still be setting Puppet interval to 30 and Out of sync to 5. The problem is that default values remained as 35 and 30 so it does not make much sense.

If you’re touch this I think we could finally start using “#{origin}_interval” only to make this simpler. We also still fallback to the Out of sync interval in case plugin don’t define its own interval, perhaps we could update all remaining plugins if needed (salt, chef). Last note, the default_report_interval also takes host parameter into the consideration, that functionality should probably remain, but could move to the “#{origin}_interval” method.

Thank you, Marek, for this very good explanation. I think that it would be nice to clean things up and clean up a little bit. I can work on this. Let me sum up what needs to be done:
In a first step:

  1. Overwrite global “outofsync_interval” with plugin specific “#{origin}_interval” if available (I would leave the fallback for now, it can still be removed later)
  2. Move the host parameter override from global setting “outofsync_interval” to plugin specific “ansible_interval”
  3. Rename “puppet_interval”, “ansible_interval” settings to something more meaningful like “puppet_outofsync_interval”, “ansible_outofsync_interval” (–> needs an update of the Ansible plugin)
  4. Update scope for dashboard such that it takes into account host specific outofsync settings
  5. Dashboard might need some update as well as it shows only the globally set “outofsync_interval”, “ansible_interval” and “puppet_interval”

In a second step:

  1. Add “plugin_outofsync_interval” setting to remaining plugins (Salt, Chef)
  2. Remove global “outofsync_interval”?

Is there anything else I have forgot? What do you think about renaming the settings (step 3)?

3 Likes

I believe that’s the way to go :+1: I definitelly agree with switching to override even if it might break backward compatibility and make the interval shorter for some users and cofuse them.

If we also rename in the same release, it would be great as users would more easily notice we’ve touched this funcionality.

I’d just add a step to note this in release notes and preferably also demo it :slight_smile:

2 Likes