The prometheus-client rubygem 1.0 has finally been released. It fixes one of the major pain points, supporting multiple instances, by sharing metric storage in, surprisingly, a simple file. I’ve been in touch with the developers; if you want more details, there is an exciting talk available from RubyConf 2019:
I’ve just bumped the RPM dependency we carry in our repos; for Debian there is nothing to wait for:
To start monitoring Foreman via Prometheus, simply:
Then configure Prometheus to scrape the /metrics endpoint. The client library now works under Passenger or any other forking server and reports the correct numbers.
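Before pointing Prometheus at the endpoint, it can help to check by hand that /metrics returns sensible data. The following is a minimal sketch, not part of Foreman: the function names and the example URL are hypothetical, and it assumes the endpoint emits the plain Prometheus text format without timestamps.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text exposition lines into a {series: value} dict.

    Assumption: each sample line ends with the value (no trailing timestamp).
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Split on the last space so label values containing spaces survive.
        series, _, value = line.rpartition(" ")
        try:
            samples[series] = float(value)
        except ValueError:
            continue  # Not a sample line we understand; ignore it.
    return samples


def fetch_metrics(url, timeout=10):
    """Download the endpoint and return the parsed samples."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


# Usage (hypothetical host name):
#   samples = fetch_metrics("http://foreman.example.com/metrics")
#   print(len(samples), "series exposed")
```

An empty result or an HTTP 404 usually means telemetry is not enabled or the client library is missing.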
Warning: It looks like both Passenger and Puma, our web app servers, recycle worker processes quite often. This creates many temporary files in a relatively short period of time (hours on some deployments), which unfortunately makes the /metrics endpoint slower and slower, to the point that it kills the whole deployment. Only a restart of the app helps, because it cleans the temporary directory completely. The Ruby client library maintainers are aware of the problem and said it will be challenging to implement some kind of “squash” mechanism.
Edit 2020: I recommend using the statsd exporter instead of the native Prometheus client, with the statsd_exporter Prometheus bridge collecting the data. See: Monitoring Foreman with Prometheus via statsd
The client library RPM update will be part of the 1.24.1 minor update. Follow this guide to enable Prometheus and scrape telemetry data from the Foreman application starting with version 1.24.1! Thanks @tbrisker for the extra effort.
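To see whether a deployment is affected, one option is to watch how many files accumulate in the metric storage directory. This is only a sketch: the helper names and the threshold are made up for illustration, and the directory must be supplied by you, since its location depends on how the client library is configured.

```python
import os


def count_metric_files(directory):
    """Return (file_count, total_bytes) for regular files directly in directory."""
    count = 0
    total = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                count += 1
                total += entry.stat(follow_symlinks=False).st_size
    return count, total


def too_many_files(directory, threshold=1000):
    """True when the directory holds more files than the (arbitrary) threshold."""
    count, _ = count_metric_files(directory)
    return count > threshold
```

Running this from cron and alerting when the count keeps growing gives an early warning before the endpoint becomes too slow to scrape.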
Thanks!
I’m testing this on Satellite 6(.7 snap) and it seems to work great. Anyway, is there any plan to extend this with katello-related metrics?
I think scraping some Katello instance counts via fm_rails_activerecord_instances might be really helpful for performance monitoring.
Hey, tnx for this.
I’ve set this up and it works on the main Foreman server, running Foreman 2.0/Katello 3.15. Is there a way to do this on smart proxies? I’m not seeing telemetry settings in /etc/foreman-proxy/settings.yml on smart proxy servers.
Also, any good Grafana dashboards you could recommend?
tnx
Hey, unfortunately this has not been implemented for the smart proxy yet. There is no telemetry available there.
Also, there are currently no dashboards. We are looking into the Grafana that ships with the RHEL 8.x series; it has been greatly improved and there is a new PCP source as well. We will likely ship a dashboard at some point in the future. For now, create your own and please share it with us.
The dashboard linked above is from RHEL7 Grafana, a very old version, and I do not recommend it because the PCP integration does not work well.
Oki, tnx. I’ll make my own then and open source it. I’m thinking that in addition to the Foreman Prometheus metrics I’ll combine it with node exporter metrics, specifically systemd metrics, so that at least alerts work if a service on a smart proxy fails. I’ll update this thread with links when it’s done.
Cool, we would appreciate it if the implementation were the same as or similar to what we have in Foreman core. It is a small facade with two implementations: statsd and prometheus.
I would also be happy to accept a patch to smart_proxy_monitoring if you want to add an additional provider. It was always meant to be open to other monitoring tools.
Of course, it is also fine if you want to provide a separate feature if the goal is too different, as smart_proxy_monitoring and foreman_monitoring are meant to integrate monitoring data into Foreman and to automate the monitoring solution’s configuration via the Smart Proxy.
I tried to activate telemetry for Prometheus: I installed the two packages (foreman-telemetry prometheus-client), changed the config (enabled prometheus) and restarted all services. Unfortunately I cannot access http://satellite/metrics: “The page you were looking for doesn’t exist.”
Did you see an “Unable to initialize XYZ telemetry” warning message in production.log after start? That’s what Foreman logs if there is a missing dependency.
I have just tested this on my 6.8 (alpha build) instance and it works fine. Note that you need the prometheus client library 1.0 or newer in order for this to work. This is not in 6.7 yet if I remember correctly.
Ah, sorry… blindly copied your line for my answer. But I also grepped for single words (“unable”, “telemetry”,…) and didn’t find any suspicious message.
When should I expect to see this message? After loading /metrics or after reloading/restarting foreman/httpd?
Sometimes you overlook the obvious. The prometheus option (settings.yaml) was set to false … thanks Puppet. After (re-)activating the option and restarting the services it now works.
Great, let us know how it works. We have one report from a user that the endpoint takes 1 minute to process. We don’t see that behavior ourselves, so be careful and monitor the monitoring.
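One simple way to "monitor the monitoring" is to time a scrape yourself and compare it against your Prometheus scrape timeout. A minimal sketch, assuming a hypothetical host name; `timed_scrape` is not an existing tool, just an illustration:

```python
import time
import urllib.request


def timed_scrape(url, timeout=90, opener=urllib.request.urlopen):
    """Fetch url once and return (elapsed_seconds, body_size_in_bytes).

    The opener parameter exists only so the function can be exercised
    without a network connection.
    """
    start = time.monotonic()
    with opener(url, timeout=timeout) as resp:
        body = resp.read()
    return time.monotonic() - start, len(body)


# Usage (hypothetical host name):
#   elapsed, size = timed_scrape("http://foreman.example.com/metrics")
#   print("scrape took %.2fs for %d bytes" % (elapsed, size))
```

If the elapsed time creeps up between application restarts, that matches the slowdown described in the warning in the first post.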