Monitoring Foreman Performance with PCP

Hello!

I tried to set up monitoring of Foreman performance with Performance Co-Pilot (PCP) by following the documentation, and I am running into problems.

When I do this on an almost empty Foreman, everything seems to work fine.

However, on systems with somewhat more load (but otherwise perfectly functional), it works for a few days until the Redis database crashes, which in turn causes the dynflow-sidekiq workers to crash.

I am attaching the redis log.

After disabling PCP everything goes back to working fine like before. I have tried this on two separate instances so far.

I have two questions:

  1. Is anybody else using PCP as described in the documentation on a system with some load, and can you confirm or deny my issues?
  2. When PCP is configured as described in the documentation, pmproxy writes its data to the same Redis server that Foreman uses. Couldn’t this be problematic?

Thank you very much in advance!

Ottavia

redis.log (27.1 KB)

I looked at PCP and said nope. I am using the built-in telemetry, which isn’t as much as I want, but it’s plenty to monitor and alert on performance issues.

Since it sounds like PCP is a no-go because of the performance issues, @Ottavia, would you mind opening an issue in the theforeman/foreman-documentation repository on GitHub with details about the problems you are having and perhaps some info about the size of your installation? We can use that issue as a starting point for improving the integration that Foreman advertises. I could then work on getting the issue in front of whoever maintains the PCP integration these days.

I’ve asked around and heard this: “If they stop the collection of hotproc (process-related) metrics they most likely solve the performance degradation”.

The caveat of course is if you need process-related metrics, then this is a problem. Regardless, I’ve confirmed that you’re seeing a known performance issue with PCP.

Thanks for the feedback! I have opened an issue: “Monitoring Foreman performance with PCP”, Issue #4757 in theforeman/foreman-documentation on GitHub.

I will try disabling the collection of hotproc metrics and report back! Keep in mind that this will probably take a while, since the problems did not appear immediately in my experience.
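
Since the crash takes days to show up, it may help to record Redis memory figures in the meantime. Below is a minimal Python sketch that parses the `key:value` text printed by `redis-cli INFO memory` and flags when `used_memory` exceeds a limit. It assumes the crash is related to Redis memory growth, which nothing in this thread confirms. The function names, the sample text, and the limit are all made up for illustration.

```python
def parse_redis_info(text):
    """Parse the 'key:value' lines printed by `redis-cli INFO`.

    Section headers (lines starting with '#') and blank lines are skipped.
    All values are kept as strings.
    """
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def memory_warning(fields, limit_bytes):
    """Return True if used_memory is above limit_bytes (hypothetical threshold)."""
    used = int(fields.get("used_memory", 0))
    return used > limit_bytes


# Hand-written sample in the INFO format, not taken from the attached log.
SAMPLE = (
    "# Memory\r\n"
    "used_memory:1048576\r\n"
    "used_memory_human:1.00M\r\n"
    "maxmemory:0\r\n"
)

if __name__ == "__main__":
    info = parse_redis_info(SAMPLE)
    print(info["used_memory"], memory_warning(info, 512 * 1024))
```

In practice, the sample text would be replaced by the real output of `redis-cli INFO memory`, collected at regular intervals (for example from cron), so that any growth can be compared between runs with and without hotproc metrics.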