Many open source projects are adopting guidelines on how to use AI for the benefit of their communities. With this RFC, I’d like to start a conversation about whether the Foreman community could benefit from an AI usage policy, and if so, what that policy could look like.
Proposal
I’d like to center the conversation around two main areas:
If the Foreman community adopts an AI policy, what principles should it be based upon?
If the Foreman community adopts an AI policy, what form should it take?
Looking around at other communities and conversations for inspiration, I see many options.
When it comes to principles, people consider licensing, fair use, security aspects of AI-based contributions, environmental impact, transparency, and many others.
However, I’d like to hear from those of you who contribute to the Foreman/Katello project. What do you consider important? What is your priority? What do you think would be the best fit for Foreman?
Alternative Designs
N/A
Decision Outcome
TBD
Impacts
The expected (but still only potential) impact is to agree upon, develop, and then merge a document or statement summarizing the Foreman project’s approach to AI-assisted or AI-generated contributions.
NOTE: This post was not AI-generated or AI-assisted
I’ll start with my own motivation: My priority is to encourage responsible use of AI. On that note, I appreciated Fedora’s call to “Embrace your human side”, which suggests avoiding AI for minor items like polishing one’s own messages (AI policy in Fedora - WIP - Fedora Discussion).
A significant part of responsible AI use is also allowing each other to learn from our experiments and mistakes, which is where disclosing whether AI was used for a contribution comes in. If I learn that another contributor has used AI, there is a chance I’ll be able to build on their experience rather than run the same experiments and make the same mistakes on my own.
Also, I believe the goal is not and should not be to discourage AI usage for contributions. If anyone discloses that their PR was AI-generated or AI-assisted, that contribution should be fairly reviewed by the other members of the community, just like any other contribution.
Thanks for putting some structured thought into this.
As someone who is slow to adopt LLM assistance, I don’t have much to contribute on the details, but I do stand to benefit disproportionately from, in your words, “allowing each other to learn from our experiments and mistakes”. So thanks!
I also want to second the “just like any other contribution” point. If the contribution provides good work, who am I to complain that an LLM was involved? If there are issues with the contribution, then that is what reviewers should say in their review.
As someone who is (or was) skeptical about AI/LLM usage, I’m starting to use it more and more in an agentic approach.
For assisted tasks/PRs, I feel we need to be explicit in requesting that contributors state in the commit message/signature that the code was written with the help of a model/LLM. I’m also in favor of adding AI definitions to the projects, to ensure that IDEs that automatically enable models understand that we have rules for AI-based contributions.
This is excellent and relevant reading. I’d definitely keep the policy very open. It should rather focus on recommendations, one of them being to denote a contribution as generated or semi-generated with AI. Code completion is IMHO not worth tagging in any way.
Thanks to everyone who has chimed in to the conversation so far!
There are a few major themes I’m seeing here:
General interest in facilitating knowledge sharing over how AI is used
Preference for not being prescriptive (or restrictive) about the use of AI, which should remain a matter of choice
Disclosing the usage of AI helps preserve clarity and community trust
However, requiring disclosure only for contributions that use AI in a “non-trivial” way obviously raises an important question: where do we draw the line between “trivial” and “substantial”? Having AI generate code is surely substantial. Should AI auto-completion, on the other hand, be considered trivial? What about querying an LLM for ideas during the brainstorming phase of a PR, before you start writing the code?
If we don’t answer this question, we might also end up with basically every PR being marked as AI-assisted, and that would just generate unhelpful noise.
Where I’m going with this is that it might indeed be beneficial to agree on a formal policy document that explains these things and more. That brings us back to the question of where to store that policy, along with how to ensure contributors are aware of it and have an interest in following it to the best of their ability.
In all of these, AI was used to summarize existing documentation to produce introductions to various sections in our documentation.
One useful thing I’ve observed is that if, as a reviewer, I am aware that a PR is AI-assisted and I notice a pattern during the review, I can ask the author to adjust their prompt before performing a detailed line-by-line review.
Also, sharing the prompt (or a summary of it) can show a reviewer what the author considered and what the priorities for the PR were. Again, the review can then start with discussing the prompt and perhaps tweaking it, before a detailed line-by-line review.
In my mind, this further supports the benefits of disclosing AI usage.