Happy new year all,
I would like to start a discussion about the centos7-devel box and its reliability. For Katello developers, being able to spin up the box successfully is a critical part of our workflow, since we use the centos7-devel box almost exclusively for development. Lately the box has not been spinning up successfully, and we have had this issue many times in the past.
I know development environments are an ever-changing beast, but I think there is room for improvement. I’m going to try to lay out the problem as clearly as I can and hopefully that helps us discuss ways to improve the reliability of the centos7-devel box.
The problem: Forklift’s centos7-devel box does not reliably spin up each time you use it (vagrant up centos7-devel).
The goal: Forklift’s centos7-devel box spins up successfully close to 100% of the time.
Why this is a problem
- Developers can’t confidently spin up a new environment when they need it
- A lot of time is wasted debugging the broken devel environment.
- A lot of time is wasted spinning up a new environment to replace a broken/outdated one, only to have this fail. This means a developer is blocked until things are fixed or has to use old/buggy environments.
- The centos7-devel box is used not only by developers but also by QE, community members, and others involved in Foreman/Katello development.
Areas of improvement
I see three areas that we can improve. Feel free to use these as discussion points or add your own. I have some thoughts around these, but I will add them separately to try to keep this post general and unbiased.
Prevention
How do we prevent the provisioning of the devel environment from breaking in the first place? Typically, something changes in development or a development-adjacent area, and a step in the provisioning of centos7-devel breaks as a result.
Detection
How do we know when the centos7-devel environment is broken?
Diagnosing and fixing
How do we know specifically what went wrong and how do we fix the issue? Who has this responsibility?
Knowledge that would be helpful to share
Many of the working parts involved in provisioning a dev server are a black box to me. I think it would be helpful if anyone could share more information about the following topics, so we all have a better understanding of the tooling that affects the centos7-devel box (feel free to request more):
- The relation between the nightly pipeline and dev environments
- How our installer works and how our various puppet modules create our development installer
- Various third-party dependencies that could affect provisioning and how we use them (rvm, node, etc…)
I hope this helps facilitate some discussion. Please note that I don’t think this thread is the right place to debug why the box is currently broken; I would rather look at the long-term stability of the box. I am looking forward to hearing everyone’s responses. Could 2019 be the year we confidently vagrant up our dev server?