Transparent Content View for Limiting Repository-Access in Katello

1. Overview

With the introduction of Simple Content Access (SCA) in Katello, the traditional method of limiting content views based on used subscriptions is no longer possible. This RFC proposes a new feature, the “Transparent Content View,” which allows users to limit content based on specific products / repositories.

2. Problem Statement

Since the implementation of SCA, several issues have been identified:

  • Removal of Subscription-Based Content View Limiting: Previously, the accessible content on a host could be limited based on the subscriptions. With SCA, all repositories are now available by default in the default organization view, making it impossible to limit access to specific repositories.
  • Default Organization View and Library State of Repositories: Some users prefer to use the default organization view because they do not want to manage content views; their goal is to deliver the latest and greatest packages without additional overhead. These users rely on the library state of repositories, which always provides the most up-to-date content. However, with SCA, the default organization view now includes all repositories without the possibility to filter or limit repositories based on specific needs, leading to undesired repository access.

3. Proposed Solution: “Transparent Content View”

To address these issues, we propose the introduction of a “Transparent Content View” feature in Katello. This feature would allow finer control over repository access without the complexity of traditional content view management.

What is a Transparent Content View?

  • A “Transparent Content View” is a new type of content view that effectively mirrors the library state of repositories.
  • It provides access to the latest and greatest packages, similar to how users currently utilize the default organization view.
  • Unlike the current default organization view under SCA, the Transparent Content View can be limited to specific products / repositories (as is usual for content views).
  • This content view type would provide a way to specify which repositories are included or excluded, enabling more granular control over content access.

The “Transparent Content View” is proposed as a flexible and user-friendly solution to manage repository access limitations introduced with SCA, aligning with user needs and the future direction of Katello.
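
As a minimal illustration of the idea (a Python sketch, not Katello code; the class names, fields, and repository names are all hypothetical), a Transparent Content View can be thought of as a named subset of the library instances, with no versions of its own:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LibraryRepo:
    """Hypothetical stand-in for the library instance of a repository."""
    name: str
    product: str


@dataclass
class TransparentContentView:
    """Hypothetical model: a named subset of library repositories, no versions."""
    name: str
    repo_names: set = field(default_factory=set)

    def visible_repos(self, library):
        # Serve the library instances directly, restricted to the chosen subset.
        return [repo for repo in library if repo.name in self.repo_names]


library = [
    LibraryRepo("alma8-baseos", "AlmaLinux 8"),
    LibraryRepo("alma8-appstream", "AlmaLinux 8"),
    LibraryRepo("sles15-updates", "SLES 15"),
]

alma_view = TransparentContentView("alma8", {"alma8-baseos", "alma8-appstream"})
visible = [repo.name for repo in alma_view.visible_repos(library)]
print(visible)
```

In this toy model, a view that lists every repository returns the whole library, which is what the default organization view does under SCA.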

This is an interesting idea!

So if I’m understanding it right, you’d have a new type of content view where you’d simply choose the repositories you want, and you’d get the library instances of those repositories only.

This content view could not be published or promoted, and would work just like the Default Organization View except that it would only contain a subset of its repositories.

It would also update all its content immediately upon sync, without the need to publish, just like Default Organization View.

@sajha This seems like it may be relatively easy to implement in Katello, without need of any changes in Candlepin. Thoughts?

Absolutely right.

I guess this can only be an option for a “Content View”. If this option is set, things like “Promote/Publish” and Versions are not shown, as it always delivers the Library state.

Thinking further, it would even be possible to create Composite Content Views from Transparent Content Views or, to keep things simpler, to prevent that altogether.

Seems like an interesting idea. The advantage of using Default Organization view in this context seems to be no overhead to publish/promote when a repo is synced. We could introduce some sort of automation to do the same for content views.

We have something similar for composite content views where the component CV can be set to “Latest version” and a new version of component CV automatically publishes the composite. Default org view works similarly in that it always points to latest version of a repository. We could set up other CVs to follow the same behavior.

Yes, we had the same idea.

Actually, it needs to be updated automatically to the Library content whenever one of the repositories in the transparent CV is synced.

Instead of introducing a new CV type, I wonder if the following would be cleaner:

In my activation key for any given CV, I can select a new “Latest” environment. A host using such a key will get the same repository version as they would get from the default organization view (library instance), but only if that repository is in the relevant CV.

In the long run one could even represent this in UI for CVs:

  • If there is new content in the library instance version of a repo in the CV, which is not yet in the latest published version of the CV, the CV will display an extra version named “next”. This version always displays as promoted to environment “Latest”. Publishing a new version turns what is currently “next” into “Version N+1”.
  • If there is no new content (i.e. publishing a new version would not create a new CV version), then the latest version N is displayed as having environment “Latest”.

=> IMHO the new environment “Latest” is what the “Library” environment always should have been.
Using any CV with environment “Latest” is exactly equivalent to creating and using a “Transparent Content View” as proposed in this thread, but I can simply use my existing CVs and don’t need to create a second “transparent” variant.
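
The display rule from the two bullets above could be sketched like this (Python, purely illustrative; the function names are hypothetical and the “Latest” environment does not exist in Katello today):

```python
def latest_environment_label(latest_published_version: int,
                             has_unpublished_content: bool) -> str:
    """Return which CV version would be shown as promoted to "Latest"."""
    if has_unpublished_content:
        # Library has newer content than the latest published version.
        return "next"
    # Publishing would not create anything new, so version N is "Latest".
    return f"Version {latest_published_version}"


def publish(latest_published_version: int) -> int:
    # Publishing turns what is currently "next" into "Version N+1".
    return latest_published_version + 1


print(latest_environment_label(3, has_unpublished_content=True))
print(latest_environment_label(publish(3), has_unpublished_content=False))
```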

Okay, but how would content view filters be handled in that case?

Also, I’d prefer to keep activation keys out of this. The cardinal rule of AKs is that they exist only to assign hosts attributes at registration time, and nothing about them is “sticky” or relevant after registration.

Filters cannot affect the “latest-next” environment-CV-version combination. It would use the same repos as the “Default Organization View”. As a matter of fact those are not filtered. This might be unintuitive to users, which is an argument for the “transparent CV” concept (which simply does not allow any filters, dependency resolution, incremental updates etc.)

I don’t think I understand your point about AKs. I am not proposing any special role for AKs after registration. I was just describing the UI based workflow of assigning content to hosts, which starts with an AK.

From the point of view of the user who prefers to use the “Default Organization View”, but needs a way to restrict it to a set of relevant repositories (which is the feature we are discussing), a CV is just a collection of repositories (of which I always want the latest synced version, which precludes filters etc.), and my user question is: Why can’t I just have that for an arbitrary CV? Why do I need to create a special kind of “Transparent CV”?

But I do recognize that the double usage (“you can use this CV in the normal CV way with filters, versions, and all the rest” or “you can use it in the transparent way: latest synced repo version, no filters”) could be confusing to users. So it is a question of which is worse. As a user, I either need to understand why there is yet another CV type, or I need to understand the different usage types of the single CV type.

If we go for the new “Transparent CV” type, I would keep that as simple as possible to start with, i.e. no filters, no versions, no adding it to a CCV, no publishing, nothing! It is simply a collection of repositories, the latest synced version of which can be presented to hosts.

So I think we’re kind of all on the same page here. This new type of content view would share a lot of properties with the Default Organization View. Just thinking about the technical implications -

Katello applies a sort of “special treatment” to the Default Organization View, wherein it gets the behavior we’re all describing here - no filters, no publishing, no versions, content is updated immediately on repository syncs. I’m not 100% clear on how all this is achieved, since that’s more on the Pulp side of things (maybe @sajha can say more). I think Default Org View may actually have publishes and versions under the hood, that are not exposed to users (?).

Also the Default Org view cannot be added to a composite content view. I was going to say it cannot be exported either, but it can be exported as part of a Library export. So that would be one difference between Default Org View and this new special kind of CV.

And what we’re considering now is more from the perspective of what makes sense to the user: how should I be able to take advantage of this new kind of content view? Should I have to explicitly create it, or should it be some sort of option when assigning hosts a content view environment, with Katello picking that up and doing the rest? If the latter, we would have to add it not only to activation keys, but also to the registration logic of hosts. Then registration_manager would have to be in charge of spawning or exposing this new “latest-next” CV during registration, and that may be an expensive or time-consuming task (@sajha, thoughts here?). Now that I type all this, it seems like a bit much. Maybe it would be simpler to use the explicit CV creation idea instead. That way you could separate registration from whatever’s involved in the creation and “publishing” of this new environment.

What I was thinking is more along the lines of a checkbox on the CV (or its child repositories) that says “Always update to latest”. That would mimic the “Always update to latest” behavior on CCVs today, where a new version of a repository would immediately publish a new version of the CV. We do need to have a CV here to support clients consuming the content, by adding it to the Candlepin DB at a path that’s different from the Default Org View.

We could also borrow some behavior from the Default Organization View, which only has one version at all times: once the latest version is published, the older versions and the archived versions are removed, and so on.

Makes sense. I think that making this granular to the individual repository level, even though technically possible, may be too confusing. But if we could have a checkbox on a content view itself (or a new type of CV), then all its repositories could “know” to behave this way.

Having recently worked on structured APT I am confident I know exactly how this works on the Pulp side of things (and mostly how it works on the Katello side of things): When a user creates a repository within a Product (before any CVs exist), the following is created: A Katello root repository, the (library instance) Katello repository associated with that root repository, and a Pulp repository, publication, and distribution, which are referenced by the library instance Katello repository. When the user syncs some content, it goes directly into this initial Pulp repository, a new Pulp publication is created from the new content, and the Pulp distribution is updated with this new publication.
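
The sync flow described above can be modelled roughly as follows (a toy Python sketch under simplifying assumptions: the classes only imitate the roles of Pulp's repository, publication, and distribution, and none of the names or the example base path are real Pulp or Katello APIs):

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PulpRepository:
    """Toy Pulp repository: each sync appends a new immutable version."""
    versions: List[frozenset] = field(default_factory=lambda: [frozenset()])

    def sync(self, upstream_packages) -> int:
        self.versions.append(frozenset(upstream_packages))
        return len(self.versions) - 1  # number of the new repository version


@dataclass
class PulpPublication:
    repository_version: int
    packages: frozenset


@dataclass
class PulpDistribution:
    base_path: str
    publication: Optional[PulpPublication] = None


def sync_library_instance(repo, distribution, upstream_packages):
    """Sync, publish the new version, and repoint the existing distribution."""
    version = repo.sync(upstream_packages)
    publication = PulpPublication(version, repo.versions[version])
    distribution.publication = publication
    return publication


repo = PulpRepository()
dist = PulpDistribution(base_path="ACME/Library/custom/AlmaLinux_8/BaseOS")
sync_library_instance(repo, dist, {"bash-5.1"})
sync_library_instance(repo, dist, {"bash-5.1", "bash-5.2"})
print(dist.base_path, sorted(dist.publication.packages))
```

The point of the model is that the distribution and its path never change; only the publication it points to does, so whoever consumes that path sees the latest synced state.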

The Katello “Default Organization View” simply serves these initial sync repos to hosts directly. There is no chance of applying filters, the state is always the latest synced state, and no additional Pulp or Katello side actions need to be taken, beyond the Katello sync itself. All needed entities are already available immediately after sync. (I am a little hazy on how the needed Candlepin entities to communicate the “Default Organization View” to hosts are created.)

This is part of the attraction of the “Default Organization View” for users. They don’t want the overhead of publishing a CV version (which involves creating a second Pulp repository and copying content from the library instance Pulp repository to the new CV version Pulp repository, applying filters in the process, all to freeze that state forever, at significant cost). Adding an auto-publish option to the existing CVs does not satisfy our users’ requirements. What they are asking for is: “Default Organization View”, but limited to a set of repositories of our choice, instead of the very inflexible “all repositories that exist in Katello”. This does not require any creation of new entities, expensive publishes or similar, in either Pulp or Katello. I think it does require creating some things in Candlepin (which should be pretty cheap), but apart from that it should be very lightweight, and just require an implementation that queries existing DB records on the Katello side.

The only input information the new implementation would need is “what is the set of repositories?” that users want served in the latest synced state. This information could be provided by any existing CV (or even CCV), because those are clearly associated with a set of repositories. (This is what I tried to pitch when I joined this conversation :wink:). It could also be provided by a new type of “Transparent CV” like @Bernhard_Suttner pitched in his original post. The difference is entirely one of UX, but both variants should be pretty similar as a matter of back end implementation.

From a UX perspective I see advantages and disadvantages to both variants: The “transparent CV” adds another type (as far as the user is concerned), and may force some users to maintain the same repo set both as a normal and then again as a transparent CV, which sounds annoying. On the other hand, adding an extra option/special use case to existing CVs risks making those seem even more complex to users than they are already, which could create confusion (including for users who don’t care about “Default Organization View”/transparent CV). I am genuinely unsure which I prefer. I am sure what I want the back-end implementation to be, which should be the same for both variants.

My apologies for my walls of text, but I feel I have not yet been as clear as I could be. This morning I went over the UI together with @Bernhard_Suttner and we came to a consensus on what we think the best possible solution is (given what already exists within Katello). It combines his “Transparent CV” concept with some of my suggestions for the backend implementation. I will try to describe this latest state of deliberations as clearly as I can.

Background on how the status quo works:

Before I can start I need to describe the status quo of how the “Default Organization View” works:

When a user creates a new repository within some product, Katello creates a Katello::RootRepository record as well as a Katello::Repository record that is referred to as “the Library instance of this repository”. The latter references a newly created Pulp repository and Pulp distribution. You can look at the Pulp publication in your browser by following the “Published At” link from the Katello repository UI page. When you sync some content to the Katello repository, it is synced into the existing Pulp repo, a new Pulp publication is created from the result, and the existing Pulp distribution is updated with the new publication. If you now follow the “Published At” link you will see the new content. The “Published At” URL is structured as follows: https://<your_instance>/pulp/content/<org>/Library/custom/<product>/<repo>/. It always serves your latest synced state.

When you tell a host to use “Library” + “Default Organization View” as its content source, the host will consume that very same “Published At” URL that you can see on the repository page. That is the very same Pulp repo that Katello asks Pulp to sync content into, which means you automatically always get the latest synced state. In Katello terminology, the host is consuming the “Library instance repository, associated with the given root repository”.
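
For illustration, the “Published At” URL pattern quoted above can be assembled like this (a small Python sketch; the function name and the example values are hypothetical, and it assumes the org, product, and repo parts are already URL-safe):

```python
def published_at_url(instance: str, org: str, product: str, repo: str) -> str:
    """Build the library instance URL following the pattern quoted above."""
    return f"https://{instance}/pulp/content/{org}/Library/custom/{product}/{repo}/"


url = published_at_url("katello.example.com", "ACME", "AlmaLinux_8", "BaseOS")
print(url)
```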

The “Library” environment for a normal CV (not the “Default Organization View”) is very different. It requires you to create a CV version, which will then automatically be promoted to “Library”. Under the hood this creates two extra Katello::Repository records (one for the CV version, and one for the CV-Library environment combination), which are in turn associated with newly created Pulp entities. The CV version repository record gets a new Pulp repository and content is then copied from the “library instance” Pulp repository to that new Pulp repo (CV filters are applied as part of the copy action). Katello will also have Pulp create a new Pulp publication and Pulp distribution from the new repo. The CV-Library repository record does not get new Pulp entities, instead it is associated with the same Pulp entities as the CV version repository. The difference is that the version repository will never have those Pulp associations updated again (it represents a frozen content state), while the Library repo will be updated if Library is later promoted to a different CV version. Hosts ultimately consume the Pulp repo referenced by the CV-Library Katello repository instance.
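
The record bookkeeping for a normal CV publish can be sketched as follows (Python, purely illustrative; `RepoRecord` and `publish_version` are hypothetical stand-ins for the Katello::Repository rows and the publish action, and the string IDs stand in for Pulp repositories):

```python
import itertools
from dataclasses import dataclass

_ids = itertools.count(1)


@dataclass
class RepoRecord:
    """Hypothetical stand-in for a Katello::Repository row."""
    role: str       # "library_instance", "cv_version" or "cv_library"
    pulp_repo: str  # identifier of the Pulp repository the row points at


def publish_version(library_instance, cv_library=None):
    """Publish a CV version and promote it to the CV-Library environment."""
    # Stands in for the expensive part: new Pulp repo, filtered copy, publication.
    new_pulp_repo = f"copy-{next(_ids)}-of-{library_instance.pulp_repo}"
    version_record = RepoRecord("cv_version", new_pulp_repo)  # frozen from now on
    if cv_library is None:
        cv_library = RepoRecord("cv_library", new_pulp_repo)
    else:
        cv_library.pulp_repo = new_pulp_repo  # Library now follows the new version
    return version_record, cv_library


library_instance = RepoRecord("library_instance", "library-pulp-repo")
v1, cv_lib = publish_version(library_instance)
v2, cv_lib = publish_version(library_instance, cv_lib)
print(v1.pulp_repo, v2.pulp_repo, cv_lib.pulp_repo)
```

Each publish creates another copy, while the library instance record keeps pointing at the same Pulp repository throughout.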

Terminology: I find it useful to distinguish between “latest-Library” which is the Library environment as used by the “Default Organization View”, and “CV-Library” which is the Library environment as used for a normal CV.

The use case:

Some users like using “latest-Library”, because it is simple, new syncs are immediately available to hosts, and they do not care about the ability to roll back to old CV versions, apply filters, etc.

The problem:

Currently the only way to use “latest-Library” is via the “Default Organization View”. However, the “Default Organization View” automatically contains all the repositories that exist in Katello. Before SCA it was possible to limit this via the subscriptions available to some host. With SCA this is no longer possible. Users can still disable any repos they do not care about, but that still leads to significant irritation for users that have content for multiple different OSes and OS versions in Katello. It simply does not make sense to configure my “Alma Linux 8” repos on my “Rocky Linux 7” hosts even if they are correctly disabled. On SLES, various zypper list commands will list all the irrelevant disabled repositories by default. Depending on how many OSes and OS versions I have I might have hundreds of disabled repositories along with the 5 enabled repositories I actually want.

The proposal:

Let’s have a “transparent CV” that can be associated with arbitrary Katello repositories (user choice), just like a normal CV. Unlike a normal CV, the transparent CV does not have any Versions, Filters, or History, and is only ever available in the “Library” environment, just like the “Default Organization View”. Unlike with the “Default Organization View”, hosts using “Library” with some transparent CV only get the repositories that are in the transparent CV. Just like with the “Default Organization View”, hosts will consume the Pulp distribution that is linked to from the Katello repository page via the “Published At” URL.

This has the enormous advantage that no new Pulp repositories are ever needed for a transparent CV, avoiding the expensive operation of copying content from one Pulp repo to another. In fact, no new Katello or Pulp entities of any kind are needed. The only new entities that are needed are some Candlepin entities whenever a new transparent CV is created, or has a repository added to or removed from it by the user. The Candlepin content for the single root repository associated with any given Katello repository already exists and does not need to be created. Each transparent CV needs a single Candlepin Environment (different from a Katello Environment), created when the CV itself is first created. Whenever a repository is added to a transparent CV by the user, the relevant Candlepin content needs to be added to the relevant Candlepin environment. When a repo is removed, the Candlepin content needs to be removed from the Candlepin environment. That should be it.
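
The Candlepin bookkeeping described above could look roughly like this (a toy Python model; the class names, the environment ID scheme, and the content IDs are hypothetical and do not reflect the real Candlepin or Katello APIs):

```python
class CandlepinEnvironmentModel:
    """Toy stand-in for a Candlepin environment: just a set of content IDs."""

    def __init__(self, env_id):
        self.env_id = env_id
        self.content_ids = set()


class TransparentCV:
    def __init__(self, name):
        self.name = name
        self.repo_content = {}  # repo name -> already existing Candlepin content ID
        # One Candlepin environment per transparent CV, created once with the CV.
        self.candlepin_env = CandlepinEnvironmentModel(f"library-{name}")

    def add_repository(self, repo_name, content_id):
        # The content already exists in Candlepin; it is only linked to the environment.
        self.repo_content[repo_name] = content_id
        self.candlepin_env.content_ids.add(content_id)

    def remove_repository(self, repo_name):
        content_id = self.repo_content.pop(repo_name)
        self.candlepin_env.content_ids.discard(content_id)


cv = TransparentCV("alma8")
cv.add_repository("alma8-baseos", "content-101")
cv.add_repository("alma8-appstream", "content-102")
cv.remove_repository("alma8-baseos")
print(cv.candlepin_env.env_id, sorted(cv.candlepin_env.content_ids))
```

Nothing in this model touches Pulp: adding or removing a repository only changes set membership in the environment.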

The necessary UI changes are also pretty minimal. The CV creation box needs to list a third possible type (“Transparent CV”). Once created, a transparent CV looks exactly like a normal CV, except that the Versions, Filters, and History tabs, as well as the “Publish new version” button, are missing. There are no new Katello environments, since Transparent CVs will use “Library” just like the “Default Organization View” uses “Library”. When creating an Activation Key, or changing the content source on a content host, the only thing that changes is that once a user selects “Library” for the environment, the drop-down list of CVs will include any transparent CVs that exist.

We think this makes for a very clean design, that does not add any additional complexity to existing CVs or CCVs, and creates a bare minimum of new entities.

Note: Once this proposal is implemented, the “Default Organization View” would be functionally equivalent to a transparent CV, that the user has added all repositories to.

One more note on the two alternate proposals that were made during the course of this thread:

  • My idea of “simply” using existing CVs and allowing them to be used as a “transparent CV” as well turned out to be not so simple in the details:
    • Since normal CVs already have a “Library” environment, this would have required yet another environment (let’s call it “Latest”) to access the “transparent CV version” within the normal CV. We would then be stuck eternally explaining to users what the difference between “Library” for CVs, “Library” for “Default Organization View” and “Latest” is. It would also have created UI challenges on the pages for creating AKs and changing the content source for hosts (because of the new environment).
    • In addition we would need to make clear to users, that even though there are filters, etc. in the CV, those are not applied to hosts using “Latest”, which would almost certainly create additional user confusion.
  • @sajha’s idea of adding an option to automatically create a new CV version (which would then be promoted to Library) whenever new content is available would satisfy the requirements for what content we want delivered to hosts, but at the cost of frequently publishing and promoting new CV versions. This is an expensive operation. I could see some users of normal CVs using this feature if it was added. But users of the “Default Organization View” who just want a way to restrict the set of repos that is configured would almost certainly hate all the overhead from all that publishing and promoting (many chose to use the “Default Organization View” to avoid this in the first place). Since there is a more lightweight alternative by way of the proposed “Transparent CV”, we should go for that IMO.

Now the fun part: What to call it?

  • Transparent content view
  • Simplified content view
  • Diet content view
  • Living content view
  • Live content view

Oh, :grinning: I asked ChatGPT what to name this feature. Suggestions:

  • Rolling Content View
  • Live Content View

I like Rolling Content View

Maybe “Mirror content view” because it just mirrors what was synced.

Maybe just “Simple content view” instead of “Simplified content view”? Or is that too similar to “Simple content access”?

Or maybe “Dynamic content view” (cause it does not have fixed versions).

Oh. I like “Rolling content view”!

“Rolling content view” does a nice job of giving some idea about the purpose. The content is ‘rolling’ in that it changes with the Default Organization View content.

I just had a random thought – in the planning of this, we’ll need to make sure that import/export doesn’t accept these new kinds of content views, right?

Under the hood, all exports are content view version exports. The options you have when you export are

  • Library - creates a (semi-)invisible “Export-Library” content view and version, and exports that
  • Version - exports a content view version
  • Repository - creates an invisible content view version with only the single repo, and exports that

Since rolling content views aren’t going to have content view versions, I’m thinking we may not need any manual adjustments to existing export APIs. However, if we wanted to allow users the ability to export rolling content views, we could do that, in a similar way to what we do with Library today. That would just be a bit of additional work.