Generation of cached API doc and dynamic content types

Background

Foreman and every plugin with an API use apipie-rails. It describes the API (locations, methods, parameters). It can also generate human-readable documentation that’s served on /apidoc and that we also publish on apidocs.theforeman.org.

To do this, Foreman generates apidocs and ships them in its package. The same is done for every plugin.

Then after installation the indexes are regenerated (using the apipie:cache:index rake task). This is because there’s an index page with all endpoints that can only be assembled once you know which plugins are present. There is also a JSON file which has the complete description, which again can only be generated with all plugins present.

This is also the reason why it’s generally recommended to visit /apidoc on your own instance rather than our apidocs: they document your installed plugins.

The problem

Katello behaves differently depending on which content types are present. This means some parts are only visible if Pulp is installed with Debian or OSTree support. For the UI this works well, because that’s generated dynamically anyway.

The problem appears in the API documentation. During package building it’s not known yet which content types will be available on the target system. That means the generated documentation is incomplete.

An example is always clearer:

This means RepositoryTypeManager.removable_content_types(false).map(&:label) is executed during packaging and only the types available during packaging are visible by default. There may be more examples.
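
To make the effect concrete, here is a minimal sketch in plain Ruby. It is not Katello or apipie-rails code: FakeTypeRegistry and build_cached_doc are hypothetical names made up for illustration, and the content type labels are only examples. It just shows that a description rendered at build time keeps whatever the build host knew.

```ruby
# Hypothetical stand-in for Katello's RepositoryTypeManager; not the real API.
class FakeTypeRegistry
  def initialize(labels)
    @labels = labels.dup
  end

  def labels
    @labels.dup
  end
end

# Mimics the packaging step: the parameter description is rendered once
# and the resulting string is what ends up in the package.
def build_cached_doc(registry)
  "content_type: one of #{registry.labels.join(', ')}"
end

build_host    = FakeTypeRegistry.new(%w[yum file])            # known while packaging
target_system = FakeTypeRegistry.new(%w[yum file deb ostree]) # enabled after install

shipped_doc = build_cached_doc(build_host)

puts shipped_doc                      # lists only yum and file
puts target_system.labels.join(', ')  # what the instance actually supports
```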

In https://github.com/theforeman/puppet-foreman_proxy_content/pull/393 we at least set up some events so that the index is refreshed when a content type is added, but realistically this isn’t going to cover everything.

Solutions

Generate with all options by default in Katello

It is possible to make Katello generate all options. This could be achieved by pretending all types are present during packaging. This would generate complete API documentation, but it may be too broad.

This could be seen as a workaround.

Rebuild API documentation on the target system

It is possible to run the apipie:cache rake task on the target system (which is what packaging does). This is slow: on my machine with Katello it took about 50 seconds per language plus some other common overhead. It would also need to be done after every Foreman/plugin package installation, which makes it very expensive.

Not a solution I’d like to see due to the cost for end user systems.

Disable apipie cache

Essentially this is like a caching problem. apipie-rails can generate the apidocs on the fly. This is what happens in development. It’s technically a bit slower, but given how little the apidocs are used it’s probably an acceptable overhead.

Perhaps the JSON view (as used by Hammer, FAM and other apipie clients) could be cached by Foreman using its regular caching system (file based by default, possibly memcache or Redis).
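
A rough sketch of what that could look like, assuming the cache key is derived from the installed plugins so that installing or updating a plugin simply misses the old entry. TinyCache is a tiny in-memory stand-in that only mimics the fetch-with-block style of a cache store; apidoc_cache_key, render_api_json and cached_api_json are hypothetical helpers, not existing Foreman code.

```ruby
require 'digest'
require 'json'

# Tiny in-memory stand-in for a cache store with a fetch-with-block API.
class TinyCache
  def initialize
    @store = {}
  end

  # Returns the stored value, or runs the block once and stores its result.
  def fetch(key)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end
end

# Hypothetical: build the key from plugin names and versions, so a changed
# plugin set never hits an entry rendered for a different set.
def apidoc_cache_key(plugins)
  fingerprint = plugins.sort.map { |name, version| "#{name}-#{version}" }.join(',')
  "apidoc/json/#{Digest::SHA256.hexdigest(fingerprint)}"
end

# Hypothetical renderer; the real one would be the expensive apipie call.
def render_api_json(plugins)
  JSON.generate('plugins' => plugins.keys.sort)
end

def cached_api_json(plugins, cache)
  cache.fetch(apidoc_cache_key(plugins)) { render_api_json(plugins) }
end

cache = TinyCache.new
puts cached_api_json({ 'katello' => '4.4.0' }, cache) # rendered and stored
puts cached_api_json({ 'katello' => '4.4.0' }, cache) # served from the cache
```

Keying on the plugin set avoids an explicit invalidation step, at the cost of leaving old entries around until the store expires them.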

This would also have faster package builds and smaller packages (thus faster installation) as an advantage.

Rewrite API docs to use the JSON document as a source

We already generate a JSON document that describes the whole API. Perhaps a JavaScript based UI could be built instead of using ERB templates. That would remove the need to ship so many files. It places the cost of rendering on the user’s browser.

This sounds like a costly rewrite that would take time.

Conclusion

I’m leaning towards investigating option 3 (disabling the cache) but perhaps I’m missing another solution. I’ll readily admit I’m not too familiar with it.

I tested this on my production system by disabling the cache, and could not tell any difference in how fast the API docs pages render with and without the cache. My vote would be to drop caching altogether.

Not only would dropping caching solve the inherent problem, it would also, as noted, speed up builds by a significant amount. In some cases, cache generation can take ~10-15 minutes of build time. Further, this would reduce complexity, bring parity between environments and generally make the operations we do a lot (build and install) faster for developers and users.

My test process:

vim config/initializers/apipie.rb 

Then update line 42:

  config.use_cache = false #Rails.env.production? || File.directory?(config.cache_dir)

And for good measure delete cache on disk and restart services:

rm -rf /usr/share/foreman/public/apipie-cache
systemctl restart foreman

This made me remember Bug #33956: serve assets directly via Apache, not via Puma/Rails - Installer - Foreman. We’re already serving the request via Puma now rather than directly via Apache (which would be faster). Perhaps that’s why you hardly notice a difference.

Currently there is code in Foreman that tries to detect that the cache is stale and tells users to regenerate it. This should be removed if we decide to not cache anymore.

Generate with all options by default in Katello

I’d say that it’s acceptable, but it solves only one small problem, and only for now. As you’ve said, it’s a workaround; in my opinion we could fall back to it if we don’t have the time or desire to go for a different solution.

Rebuild API documentation on the target system

Unfortunately we would need to do that at least for the JSON files because of hammer (well, hammer consumes only one file with the whole documentation, and it doesn’t take much time to generate AFAIR).

NOTE: Just hammer thoughts, can be skipped. I have an issue (Feature #28283: Autodetect if apidoc cache needs reload - Hammer CLI - Foreman) on my current TODO list, which is about ensuring hammer reloads a new version of the API doc JSON file if one is present. Now I think that when hammer sees that the API docs changed (due to installation of a new plugin, for example), it could either print a warning such as “Detected changes in the API, please run foreman-rake apipie:cache and after hammer --reload-cache for the update.” or, since the user would need to run the rake task anyway, we could run it automatically when hammer --reload-cache is used. Just to clarify how I see it:

  • We have apipie checksum which we compare in hammer to see if there was a change in the API. [OK]
  • If we receive a different checksum then we make hammer ask for a new JSON file (this assumes that a new file was generated by the installer or a user). [OK]
  • If a new file was generated neither by a user nor by the installer and the API has changed, hammer will ask for and receive the same old JSON file each time a user runs a command, because we always reload the cache automatically. [NOT OK] Because of this I’d show that warning, or only “Detected changes in API, please run hammer --reload-cache.”
  • If --reload-cache is used, make sure the server runs something like apipie:cache just for JSON, so hammer can actually download a new version. [OK]
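
The four steps above can be sketched roughly as follows. This is a toy model, not hammer or apipie-bindings code: FakeServer and FakeClient are hypothetical, and SHA1 over the JSON is only an assumed way of computing the checksum.

```ruby
require 'digest'
require 'json'

# Hypothetical server: holds the live API description plus a JSON export
# that can lag behind it, the way a cached file on disk could.
class FakeServer
  attr_reader :exported_json

  def initialize(api)
    @api = api
    regenerate_export
  end

  def install_plugin(resource)
    @api = @api.merge(resource => ['index'])
  end

  # Checksum of the live API, as sent along with every response.
  def checksum
    Digest::SHA1.hexdigest(JSON.generate(@api.sort.to_h))
  end

  # Stand-in for regenerating only the JSON part of the docs.
  def regenerate_export
    @exported_json = JSON.generate(@api.sort.to_h)
  end
end

# Hypothetical client mirroring the steps in the list.
class FakeClient
  def initialize(server)
    @server = server
    reload
  end

  # With regenerate: true this models --reload-cache triggering the server.
  def reload(regenerate: false)
    @server.regenerate_export if regenerate
    @cached_json = @server.exported_json
  end

  def status
    return :up_to_date if fresh?

    reload # checksum differs, so ask for the JSON file again
    fresh? ? :reloaded : :stale_export
  end

  private

  def fresh?
    Digest::SHA1.hexdigest(@cached_json) == @server.checksum
  end
end

server = FakeServer.new({ 'hosts' => ['index'] })
client = FakeClient.new(server)
server.install_plugin('ansible_roles')
p client.status                 # => :stale_export (the warning case)
client.reload(regenerate: true)
p client.status                 # => :up_to_date
```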

Disable apipie cache

Well, I haven’t tested this in production mode for quite a long time, so I can’t say whether it’s fast enough with Katello + other plugins installed, but if disabling the cache speeds other things up by a lot then I’d go for it. Moreover, we wouldn’t need to deal with caching for plugins/hammer. We would still need a cache in hammer, though, so that it doesn’t reload the docs each time, and we would somehow have to deal with computing the apipie checksum.

Rewrite API docs to use the JSON document as a source

Well, since the main problem is HTML generation, I’d also vote for a JS/React based UI (I’ve already tried to use the JSON file for autocompletion in the templates editor). This could also fix a few small issues we currently have (such as a few broken animations), and we wouldn’t need to update bootstrap in apipie-rails.

Summary

Actually, I’d consider a hybrid solution based on “Disable apipie cache” and a UI for the API from Foreman. The issues that come to my mind right now:

  • Disabling cache means revisiting hammer/apipie-bindings/Foreman to make sure it works as before.
  • Ensuring the JSON file contains all the information needed to build the same/nicer UI, since we don’t have the same power as we do with Rails helpers in ERB (apipie relies on that heavily).

IMHO both can go hand in hand. We could disable the cache as a short term fix while I think rewriting the UI would take a bit longer. So I think we’re in agreement about your summary.

Given the timing, other priorities and current capacity I don’t think we can get this in Foreman 3.2 but it would be good to start early in 3.3 to have sufficient time to stabilize it.

I would even vote for making the API documentation generation as lazy as possible, i.e. moving it to the actual documentation request.
Use case we want to consider:
Assuming you have a plugin that introduces a facet to a host. Now let’s assume you are also adding scoped_search definitions to the host object based on that facet.
Since currently the documentation is generated upon loading the controller class and we do not control when the class is actually loaded, we can have a race condition here:
The scoped_search definitions would require the plugin to be already loaded for the documentation to be generated correctly.
The controller is generic, hence it can be referenced (and hence loaded) by other plugins during the initialization phase. In this case our plugin is not yet loaded and the information for the documentation is not yet complete.
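
A small sketch of that load-order problem and of the lazy variant, in plain Ruby. FakeHost, EagerHostsController and LazyHostsController are made-up names, and this scoped_search is a toy method, not the real scoped_search DSL; the facet_status field is just an example.

```ruby
# Hypothetical model whose search definitions can be extended by plugins.
class FakeHost
  @search_fields = %w[name]

  class << self
    attr_reader :search_fields

    # Toy version of adding a search definition.
    def scoped_search(field)
      @search_fields << field
    end
  end
end

# Eager: the description is built while the controller class body is
# evaluated, i.e. whenever something first references the controller.
class EagerHostsController
  SEARCH_DOC = "Searchable fields: #{FakeHost.search_fields.join(', ')}".freeze
end

# Lazy: the description is built on the first documentation request.
class LazyHostsController
  def self.search_doc
    @search_doc ||= "Searchable fields: #{FakeHost.search_fields.join(', ')}"
  end
end

# The plugin gets initialized only after the controllers were loaded.
FakeHost.scoped_search('facet_status')

puts EagerHostsController::SEARCH_DOC # misses facet_status
puts LazyHostsController.search_doc   # includes facet_status
```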

I have attempted to capture the outcomes of this discussion in a Redmine ticket, with a set of tasks for the projects that are affected.

I think the consensus here is that this has all upsides, with a couple of caveats to ensure we do not have any gaps in production. If there is anything missing, whether it be a story or information, please do add it to the ticket.

This RFC has been completed per Tracker #34639: Stop generating api-pie cache and rely on dynamic generation of API docs - Foreman.

Apipie caching is no longer enabled, and caches are no longer generated for Foreman or plugins. Apipie documentation will now be generated on-demand whenever a user requests it. This saves around 1 minute at packaging build time and around 2 minutes in the installer. These are small savings that add up over time due to how often we run the installer and perform package builds for Foreman and Katello.

The detailed summary of changes:

  • Apipie caching (Apipie DSL was not touched) has been disabled in Foreman
  • The apipie:cache:index rake task has been removed from puppet-foreman
  • All packaging has been updated to no longer generate an apipie cache in Foreman or plugins across RPM and DEB

During the rebuilds, some plugins were identified that could not be built for unrelated reasons. They are:

RPM

  • rubygem-foreman_openscap
  • rubygem-foreman_acd
  • rubygem-foreman_snapshot_management
  • rubygem-foreman_leapp

DEB

  • ruby-foreman-statistics
