We are talking about image-based provisioning for compute resources that support it, for example Azure.
Currently we have two ways to provision a new host using predefined images: we can either use images with a cloud-init script, or use SSH to execute the post-provisioning script.
The SSH use case is not straightforward and is prone to errors that are hard to reflect back to the user. After talking to @ekohl, we think that this use case is not common and that it would be better to remove it.
I would like to know whether our assumption that SSH is not used is indeed true, so that we are not removing a useful feature.
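To make the SSH variant concrete, here is a minimal Python sketch of what SSH-based post-provisioning involves: wait until the new host accepts connections, then pipe a rendered finish script to a shell on it. This is not Foreman's actual implementation; the function names, timeouts, and the use of the `ssh` command line client are assumptions for illustration only.

```python
import socket
import subprocess
import time


def wait_for_ssh(host: str, port: int = 22, timeout: float = 300.0,
                 interval: float = 5.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Host not reachable yet (still booting, no network, sshd not up).
            time.sleep(interval)
    return False


def run_finish_script(host: str, user: str, script: str) -> subprocess.CompletedProcess:
    """Pipe a rendered finish script to a shell on the new host over SSH.

    Hypothetical helper: assumes key-based authentication is already set up.
    """
    return subprocess.run(
        ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", "sh -s"],
        input=script,
        text=True,
        capture_output=True,
        check=False,  # the caller inspects returncode and stderr
    )
```

Each step can fail for a different reason (the host never boots, the network is misconfigured, authentication fails, the script exits non-zero), which is one way to see why errors are hard to report back to the user in a useful form.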
Using cloud compute resources is already uncommon. In most cases I have seen image-based provisioning used with VMware, when environments want to migrate slowly to unattended provisioning. In these cases cloud-init needs some preparation, as it is usually not set up in older custom images, but it gave us the better result in the end.
Some user input from an organization that has just adopted finish templates/SSH post provisioning:
We are currently in the process of switching from PXE-based installation to an image-based approach in our (on-premise VMware) provisioning workflows. When we had to decide between cloud-init and SSH-based finish templates, we found it way easier to use finish templates.
This might be in part because we don’t have any cloud-init know-how in house, and it was probably also made a lot more straightforward by already having the whole SSH auth setup for REX. I was personally not involved in that process, so I cannot speak to the exact problems we faced with cloud-init other than “we could not get it to work”, but cloud-init is a pretty complicated beast compared to a simple shell script via SSH.
So, as a personally affected user, I would love to see finish templates stay.
So in my opinion cloud-init is the way to go where it really works. But the current way to get cloud-init to work with VMware is even more complicated than the SSH one, which is why it is good to have SSH as a fallback mechanism.
To get cloud-init to work with VMware and proxies, you have to build a mechanism inside your image which chooses the appropriate proxy and pulls the cloud-init script, as the script is currently not mounted into the image the way it is with most other compute resources.
Even if many people try to get away from VMware, it will stay for quite some time, and as long as cloud-init works only with those clunky workarounds, we need to keep the SSH version.
On the other hand, even if SSH gets removed, we still need the finish template kind, as it is also used in Debian/Ubuntu provisioning via PXE.
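As a rough illustration of the kind of in-image mechanism described above, the following Python sketch tries a list of candidate proxies and returns the userdata from the first one that answers. The proxy URLs are hypothetical, and the `/userdata/user-data` path is an assumption that should be checked against your own Foreman and Smart Proxy setup.

```python
import urllib.request
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical proxy base URLs baked into the image; not taken from this thread.
CANDIDATE_PROXIES = [
    "http://proxy-a.example.com:8000",
    "http://proxy-b.example.com:8000",
]

# Assumed endpoint path; verify it against your installation.
USERDATA_PATH = "/userdata/user-data"


def http_fetch(url: str, timeout: float = 5.0) -> str:
    """Fetch a URL and return the body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def fetch_userdata(
    proxies: Iterable[str],
    fetch: Callable[[str], str] = http_fetch,
) -> Optional[Tuple[str, str]]:
    """Return (proxy, userdata) from the first proxy that answers, else None."""
    for base in proxies:
        try:
            return base, fetch(base.rstrip("/") + USERDATA_PATH)
        except OSError:
            # urllib's URLError and socket timeouts are OSError subclasses;
            # treat them as "this proxy is not reachable from here".
            continue
    return None
```

The `fetch` parameter exists only so that the selection logic can be exercised without a network; a real image would more likely do this in a small shell script or a cloud-init datasource setting.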
I’ve always wondered why userdata can’t be used. The compute resource integration with VMware can provide this data. Or at least, it has code for it. I just haven’t tried it myself.
There are multiple ways to use cloud-init with VMware. One is to reference the source directly in the image when creating the template. The other is to use a userdata template which addresses the VMware tools (needed for static network deployments with image-based provisioning on VMware) and then somehow provides the URL from which to retrieve the rendered template.
However, cloud-init usually works by having the template mounted temporarily into the filesystem. In VMware you can also provide it as a VMware guest parameter in base64 encoding.
We tested many workflows, and we have customers using VMware with cloud-init as well as with SSH. On the other hand, a better implementation is still on the wishlist, so that cloud-init is not pulled by the template but is already available. Then we would not need the script for the VMware tools in userdata (which is also the only reason why a dedicated template kind “cloud-init” exists, because in all other use cases cloud-init is inside the userdata template).
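A minimal Python sketch of the base64 variant, assuming the guest parameter keys `guestinfo.userdata` and `guestinfo.userdata.encoding` as read by cloud-init's VMware datasource. Check the key names against the cloud-init documentation for your version; how these values are actually set on the VM (vSphere API, template, or Foreman itself) is not shown here.

```python
import base64
from typing import Dict


def guestinfo_userdata(userdata: str) -> Dict[str, str]:
    """Build guest parameters carrying base64-encoded cloud-init userdata.

    The key names are an assumption based on cloud-init's VMware datasource.
    """
    encoded = base64.b64encode(userdata.encode("utf-8")).decode("ascii")
    return {
        "guestinfo.userdata": encoded,
        "guestinfo.userdata.encoding": "base64",
    }


# Example with a trivial, made-up cloud-config document.
params = guestinfo_userdata("#cloud-config\nhostname: example\n")
```

With this approach the rendered template travels with the VM definition, so the guest does not need to know which proxy to contact.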
And with that we had to use a snippet for cloud-init, while the VMware tools reference it in the userdata template. But I would prefer to be able to use userdata directly, in the same way for all hypervisors and without the dependency on VMware tools, so that even cloud images provided by a distributor could be used, for example.
So something where the userdata template is provided directly as the guestinfo.userdata, without the workaround of an additional template.
Additionally, it gets difficult when provisioning Ubuntu, as autoinstall and cloud-init use the same userdata template kind, but one requires a partition table and the other does not. And if you set up Ubuntu with autoinstall in order to convert it into a template, additional steps are required to clean up cloud-init, as it has already been called by autoinstall; otherwise cloud-init would not work (this is where cloud images would help, but they require open-vm-tools first).
I would be happy if we can improve that behaviour, so that you can use userdata with cloud-init for all the hypervisors, without those extras for vmware.
But I am not sure whether this discussion about VMware and cloud-init is leading this thread too much into a different topic.
I think it’s on topic. If we have a good replacement (and cloud-init certainly has the potential) then we can deprecate it. Understanding what would be needed to take away the need for SSH-based post-provisioning entirely does help.