RFC: Pulp 4 preparation - migration from hrefs to PRNs
Context and Problem Statement
Pulp 4 is somewhere on the horizon, and hrefs are being replaced by PRNs (Pulp Resource Names). In Katello, we store Pulp hrefs for every Pulp entity that we need to keep track of. For example, repository version hrefs are stored on repository records in Katello so we know which Pulp repository corresponds to which Katello repository.
We need to stop using hrefs and instead use PRNs.
Proposal
I’m proposing to perform the PRN migration within a single release. Originally I thought it would need to happen across multiple releases due to how slow it would be to look up each Pulp entity via the API. However, the PRN values can be computed.
Example href: /pulp/api/v3/repositories/rpm/rpm/0198a4fd-deac-75fe-8942-a8fbc8476481/
Matching PRN: prn:rpm.rpmrepository:0198a4fd-deac-75fe-8942-a8fbc8476481
The UUID values will always match between the href and the PRN. Each href prefix also matches up nicely to a PRN prefix. With this information, we can use a static prefix mapping from href to PRN to compute the values.
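The computation described above can be sketched roughly as follows. This is only an illustration in Python, not Katello code: `HREF_PREFIX_TO_PRN_PREFIX` and `href_to_prn` are hypothetical names, and the table holds only the one pair taken from the example above. A real table would need an entry for each Pulp resource type that Katello stores.

```python
import re

# Hypothetical static mapping from href prefix to PRN prefix.
# Only this one pair comes from the example in the RFC; the rest of the
# table would have to be filled in per resource type.
HREF_PREFIX_TO_PRN_PREFIX = {
    "/pulp/api/v3/repositories/rpm/rpm/": "prn:rpm.rpmrepository:",
}

# A simple resource href is a prefix, then a UUID, then a trailing slash.
_HREF_RE = re.compile(
    r"^(?P<prefix>/.+/)"
    r"(?P<uuid>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})/$"
)


def href_to_prn(href):
    """Compute a PRN from an href; raise ValueError if it cannot be mapped."""
    match = _HREF_RE.match(href)
    if match is None:
        # Includes repository version hrefs, which do not end in a UUID.
        raise ValueError(f"not a simple resource href: {href!r}")
    prefix = match.group("prefix")
    if prefix not in HREF_PREFIX_TO_PRN_PREFIX:
        raise ValueError(f"no PRN prefix known for {prefix!r}")
    return HREF_PREFIX_TO_PRN_PREFIX[prefix] + match.group("uuid")
```

Raising on an unknown prefix, rather than guessing, means a gap in the mapping shows up during the migration instead of producing a wrong PRN.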
The only special case is repository versions, whose hrefs look like /pulp/api/v3/repositories/rpm/rpm/0198a4fd-deac-75fe-8942-a8fbc8476481/versions/7/. The issue is that the version href includes only the repository UUID and a version number, not the version UUID, so the PRN cannot be computed from it. For repository versions only, we’ll need to look the PRN up via the API (or, if speed becomes a real concern, via a direct connection to the Pulp DB).
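To make the special case concrete, here is a rough Python sketch that splits a repository version href into its parts and then defers to a lookup. The names `parse_version_href`, `version_prn`, and `fetch_json` are hypothetical. The sketch also assumes that Pulp's response for a repository version carries a `prn` field, which should be confirmed against the Pulp version in use.

```python
import re

# Repository href (ending in the repository UUID), then versions/<number>/.
_VERSION_HREF_RE = re.compile(
    r"^/.+/(?P<repo_uuid>[0-9a-f-]{36})/versions/(?P<number>\d+)/$"
)


def parse_version_href(href):
    """Return (repository UUID, version number) from a version href."""
    match = _VERSION_HREF_RE.match(href)
    if match is None:
        raise ValueError(f"not a repository version href: {href!r}")
    return match.group("repo_uuid"), int(match.group("number"))


def version_prn(href, fetch_json):
    """Look up the PRN of a repository version.

    fetch_json is a hypothetical callable that performs a GET on the href
    against Pulp and returns the decoded JSON body as a dict. Reading the
    "prn" key is an assumption about the response shape.
    """
    parse_version_href(href)  # validate before making a network call
    return fetch_json(href)["prn"]
```

Parsing yields only the repository UUID and the version number, which is why nothing in the href alone can produce the version's own UUID.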
The steps to develop the migration would thus be:
- Begin indexing PRNs on Katello records with Pulp hrefs
- Populate PRNs for existing records
- Remove href fields and begin using PRNs for all Pulp entities
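The "populate PRNs for existing records" step could look something like the sketch below. It is written in Python against SQLite purely for illustration. The `repositories` table, its `pulp_href` and `pulp_prn` columns, and the `backfill_prns` function are hypothetical stand-ins for whatever the real Katello schema and migration would use. The converter `to_prn` is passed in and is expected to raise `ValueError` for hrefs it cannot compute, such as repository versions.

```python
import sqlite3


def backfill_prns(conn, to_prn, batch_size=1000):
    """Fill pulp_prn for rows that have an href but no PRN yet.

    Returns the ids of rows whose href could not be converted, so they can
    be resolved separately (for example via the Pulp API).
    """
    needs_lookup = []
    last_id = 0
    while True:
        # Keyset pagination: always move past the last id seen, so rows
        # that were skipped are not selected again in the same run.
        rows = conn.execute(
            "SELECT id, pulp_href FROM repositories "
            "WHERE pulp_prn IS NULL AND pulp_href IS NOT NULL AND id > ? "
            "ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        for row_id, href in rows:
            try:
                prn = to_prn(href)
            except ValueError:
                needs_lookup.append(row_id)
                continue
            conn.execute(
                "UPDATE repositories SET pulp_prn = ? WHERE id = ?",
                (prn, row_id),
            )
        conn.commit()  # one commit per batch
        last_id = rows[-1][0]
    return needs_lookup
```

Because only rows with an empty `pulp_prn` are selected, the backfill can be re-run safely if an upgrade is interrupted.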
Testing:
- Measure the performance of the PRN migration on older hardware to ensure it truly is fast enough for a single upgrade
- Update Robottelo tests to stop relying on hrefs
Once Pulp 4 is out, Katello will already be using PRNs, so there will be no concern about href fields no longer being available.
Alternative Designs
As mentioned above, an alternative is to look up every PRN value via the Pulp API. This is likely much slower: “computing” the PRN only requires reading hrefs from the Katello DB, applying the prefix mapping through string manipulation, and writing the new PRN values, whereas fetching the values via the Pulp API means going through Pulp’s entire stack. The only benefit of the API approach is simpler logic. However, a lookup table for href->PRN is relatively simple as well, and even if the resource names change, we only need to maintain the mapping for a single release.
Decision Outcome
…
Impacts
…