API performance tuning

mfuhrmann · September 6, 2019, 2:15pm

Problem: I’m querying the API and it takes a really long time to get a response.

Expected outcome: Faster results

Foreman and Proxy versions: 1.22.1

Other relevant data:
I’m only slurping the host list (350 hosts) and grabbing some data. To get a response it takes about 20-30 seconds. So my question is: Can I do some optimisations to tune the API to improve the speed?

Lang_Jason · September 10, 2019, 12:46am

We often have this issue as well. There are things you can do to optimize, though; a simple slurp won’t help.

If you are hitting /api/v2/hosts for 350 objects and it is rendering “everything”, that will be slow, because it gives you “everything” for each host.

Add the thin=true API argument to return just the FQDNs. Your query will then return in far less than a second, even for thousands of hosts. From there, iterate through the list and query “only” what you need from each host:

/api/v2/hosts/fqdn returns ALL
/api/v2/hosts/fqdn/parameters returns just parameters (faster)
/api/v2/hosts/fqdn/parameters/paramname returns a single host parameter (fastest)
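As a rough sketch of the "thin list first, then narrow queries" idea, the helpers below only build the URLs and pull host names out of a decoded list response. The server name, the helper names, the lowercase /api/v2 prefix, and the assumption that the list response carries a "results" array of objects with a "name" key are all illustrative; check them against your own Foreman instance.

```python
from urllib.parse import quote, urlencode

# Hypothetical server name, used only for illustration.
BASE_URL = "https://foreman.example.com"


def host_list_url(base_url, per_page=1000):
    """URL for the thin host list (names only, not full host details)."""
    query = urlencode({"thin": "true", "per_page": per_page})
    return f"{base_url}/api/v2/hosts?{query}"


def host_parameter_url(base_url, fqdn, name=None):
    """URL for all parameters of one host, or a single one if name is given."""
    url = f"{base_url}/api/v2/hosts/{quote(fqdn, safe='')}/parameters"
    if name is not None:
        url += "/" + quote(name, safe="")
    return url


def host_names(list_response):
    """Extract host names from a decoded list response.

    Assumes a {"results": [{"name": ...}, ...]} layout.
    """
    return [entry["name"] for entry in list_response.get("results", [])]
```

Fetch host_list_url(BASE_URL) once with whatever HTTP client you use, pass the decoded JSON to host_names(), and then request host_parameter_url() only for the hosts and parameters you actually need.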

Foreman is also able to process multiple transactions at once provided it has the resources to do so.

In our case we encourage people to multithread a fair bit through a host list, returning only what they need. We have something like 80 Passenger process instances running, which can make it much, much faster.
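A minimal sketch of that multithreading approach, assuming you already have the list of FQDNs and some function that fetches the data for one host. Both fetch_all and the fetch_one callable are hypothetical names, and no real HTTP call is made here.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(fqdns, fetch_one, max_workers=8):
    """Call fetch_one(fqdn) for every host concurrently.

    Returns a dict mapping each FQDN to whatever fetch_one returned.
    fetch_one is supplied by the caller; in practice it would issue one
    narrow API request (for example a single host parameter).
    """
    fqdns = list(fqdns)  # materialise once, since we iterate twice
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input.
        results = list(pool.map(fetch_one, fqdns))
    return dict(zip(fqdns, results))
```

Keep max_workers well below the number of worker processes the Foreman server has available, otherwise the requests just queue up on the server side.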

Another alternative is to do your own caching. Build a custom API to get what you want fast internally: query Foreman “slowly” on an interval (hourly) and populate your own API so you can get what you need “fast”.
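One possible shape for such a cache, as a sketch only: a small wrapper that calls a slow loader at most once per interval and serves the stored result in between. TimedCache and its loader argument are hypothetical; the loader would be whatever code pulls the data from Foreman.

```python
import time


class TimedCache:
    """Serve a cached value and reload it once it is older than ttl_seconds."""

    def __init__(self, loader, ttl_seconds=3600, clock=time.monotonic):
        self._loader = loader      # slow function that queries Foreman
        self._ttl = ttl_seconds    # 3600 s matches the hourly interval
        self._clock = clock        # injectable so the cache can be tested
        self._value = None
        self._loaded_at = None

    def get(self):
        now = self._clock()
        expired = self._loaded_at is None or now - self._loaded_at >= self._ttl
        if expired:
            self._value = self._loader()
            self._loaded_at = now
        return self._value
```

This variant reloads lazily on the first request after expiry, so that one caller still pays the slow query; a background job refreshing on a schedule would avoid that at the cost of more moving parts.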

Generally I’ve always thought it would be nice if the hosts endpoint (or all API endpoints) had a “fields” filter to just whitelist/return what you want/need, as it would probably speed it up significantly vs returning “everything” but I have no idea how practical that is…

mfuhrmann · September 10, 2019, 6:49pm

@Lang_Jason thank you very much for the detailed answer!

Can I get the primary IP of a host with one API call using /api/v2/hosts/fqdn/parameters/paramname?

Do you also have any tips for the facts API? I need about 5-10 values. Right now I’m slurping the facts table, and in my tests it was slower when I grabbed the values with separate calls.

mfuhrmann · September 11, 2019, 1:46pm

/api/v2/hosts/fqdn/parameters (“returns just parameters (faster)”) definitely does not work: I don’t get any results. The same goes for /api/v2/hosts/fqdn/parameters/paramname (“returns a single host parameter (fastest)”).

tbrisker · September 11, 2019, 2:07pm

You could get just the 5-10 facts you care about using search. For example, api/fact_values?per_page=1000&search=name=osfamily will get you the values of the osfamily fact for all hosts (I used 1000 per page, but you can use any other number that makes sense for you to get all of the hosts in one go). I think this will be faster than pulling the full details for every host.
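To ask for several facts in one request, the search string could be built like this. It is a sketch under two assumptions: that the search syntax accepts terms joined with "or", and that the endpoint lives under the base URL as api/fact_values. The function name and server name are made up for illustration.

```python
from urllib.parse import urlencode


def fact_search_url(base_url, fact_names, per_page=1000):
    """Build a fact_values URL that searches for several fact names at once.

    Assumes the search syntax allows "name=a or name=b"; verify this
    against your Foreman version before relying on it.
    """
    search = " or ".join(f"name={name}" for name in fact_names)
    query = urlencode({"per_page": per_page, "search": search})
    return f"{base_url}/api/fact_values?{query}"


# Hypothetical server name, used only for illustration.
url = fact_search_url("https://foreman.example.com", ["osfamily", "ipaddress"])
```

Note that per_page now has to cover hosts times facts, since each matching fact value counts as one result.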

mfuhrmann · September 19, 2019, 11:48am

In my tests, grabbing those 5-10 facts with search instead of slurping the whole fact table took twice as long (200% of the duration). Not sure why.

The thin=true option in the host API is really good for fetching the full list.
But the /enc page seems to be the one I need, since it includes the primary IP field and the host parameters.

ekohl · September 19, 2019, 11:57am

AFAIK /enc is the only API endpoint that actually applies ERB rendering to parameters, whereas /hosts/FQDN/parameters returns the raw values.

I think @ofedoren is working on a bulk version of /enc for use in an Ansible inventory. This is probably also equivalent to your use case.

ofedoren · September 19, 2019, 2:16pm

After I finish what @ekohl mentioned, there should be a new special report template, Ansible Inventory, which you can adjust to your needs (e.g. to make the report return only the IP and/or host parameters).

But it will probably require the Foreman Ansible plugin.