As Rackspace has ended its support for open source projects, we would like to move everything off Rackspace and onto new providers. See below for a detailed layout of the current Rackspace infrastructure.
If you have questions or concerns, please raise them here.
Proposal
There are three main components to the proposal:
Moving DNS to a centrally managed provider
Keeping secure infrastructure on a single provider
Maintaining the current level of Jenkins nodes
DNS
Gandi provides open source support and centralized group management. Accounts will be set up there and DNS transferred from Ohad to Gandi. Anyone on the infrastructure team who wants access may create an account and be added to the group.
Jenkins Nodes
The non-security-focused Jenkins nodes (slave01 and slave05 on Rackspace) will be shut down and replaced by AWS Jenkins nodes that are copies of the two AWS nodes we have today.
Main Infrastructure
The main and security-focused infrastructure is:
Foreman/Puppet master
Jenkins master
Webserver
Debian Jenkins node
slave02 that handles specific jobs that push into our infrastructure
These servers will be moved alongside the existing infrastructure running at the Oregon State University Open Source Lab (OSUOSL) on their OpenStack infrastructure. This will require a capacity increase on their footprint, which they are happy to provide.
I am less sure about doing the Scaleway and Netways ones properly. The debian01 and slave02 left on Rackspace will get updated as part of their move to OSUOSL.
I’ve spun up a node05 at OSUOSL that will replace slave02.rackspace once we have figured out secrets handling and labeled the nodes properly.
Next up is creating a deb-node01.jenkins.osuosl.theforeman.org to replace the debian01 in Rackspace. I looked into this within OSUOSL but could not find an Ubuntu 18.04 image to match the current debian01. @mmoll what approach should we take here?
Last night theforeman.org was transferred from GoDaddy to Gandi. It looks like various DNS recursors have cached the old NS records (the TTL is 1 day in the .org zone) and GoDaddy has stopped responding to DNS requests. This is causing some instability. I expect it to stabilize during the day as the TTLs expire.
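To put a rough bound on how long the instability can last, here is a minimal sketch of the TTL arithmetic. It assumes the 1-day TTL mentioned above applies to the cached NS records; the function names and the timestamp in the usage note are hypothetical, not taken from the actual transfer.

```python
from datetime import datetime, timedelta, timezone

# Assumption: cached NS records carry the 1-day TTL mentioned above.
NS_TTL = timedelta(days=1)


def latest_stale_cache_expiry(last_old_answer: datetime,
                              ttl: timedelta = NS_TTL) -> datetime:
    """Return the latest moment a recursor could still hold the old NS records.

    A recursor that cached the old records at the very last moment they were
    served keeps them for at most one full TTL.
    """
    return last_old_answer + ttl


def may_still_be_stale(now: datetime, last_old_answer: datetime,
                       ttl: timedelta = NS_TTL) -> bool:
    """True while some recursor could still be serving the old delegation."""
    return now < latest_stale_cache_expiry(last_old_answer, ttl)


# Hypothetical time at which the old NS records were last handed out.
transfer = datetime(2020, 1, 1, 22, 0, tzinfo=timezone.utc)
print(latest_stale_cache_expiry(transfer).isoformat())
```

With the hypothetical transfer time above, every well-behaved recursor should have dropped the old records one day later. Recursors that ignore TTLs may take longer.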
I also took the opportunity to create DNS records for all hosts we manage.
Looks like the Katello VCR tests are now failing with:
An HTTP request has been made that VCR does not know how to handle:
POST https://node03.jenkins.osuosl.theforeman.org/pulp/api/v3/distributions/container/container/
I have run puppet agent on it and hooked it in. I also updated the labels. @mmoll could you take a look and see if it's set up correctly and will handle what debian01 does today?
None of our tests are hardcoded to expect a hostname. VCR generates requests with the current hostname in them but does not consider the hostname when looking for a matching cassette, so it's likely unrelated to the hostname change. Will look into it.
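To illustrate why a hostname change alone should not break playback, here is a small Python model of host-agnostic request matching. It is not VCR's actual code (VCR is a Ruby library), and the rule used here, matching on HTTP method and URL path only, is an assumption made for illustration. The cassette layout, function names and the recorded hostname are hypothetical.

```python
from urllib.parse import urlsplit


def request_key(method: str, url: str) -> tuple:
    """Reduce a request to the parts used for matching.

    Assumption: only the method and the path matter; the host is ignored.
    """
    return (method.upper(), urlsplit(url).path)


def find_recorded(cassette: list, method: str, url: str):
    """Return the first recorded interaction matching the request, or None."""
    wanted = request_key(method, url)
    for recorded in cassette:
        if request_key(recorded["method"], recorded["url"]) == wanted:
            return recorded
    return None


# Hypothetical cassette recorded on a different host.
cassette = [{
    "method": "POST",
    "url": "https://recorded.example.org/pulp/api/v3/distributions/container/container/",
    "status": 202,
}]

hit = find_recorded(
    cassette,
    "POST",
    "https://node03.jenkins.osuosl.theforeman.org/pulp/api/v3/distributions/container/container/",
)
print(hit is not None)
```

Under this model the request from the new node still finds its recording, so a failure like the one quoted would point to a missing or mismatched recording for that method and path rather than to the hostname.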
I’ve updated the action items to where I think we are. The next big steps are configuring all of the OSUOSL jobs to handle the types of work slave02 did, testing the new Debian node, and then migrating the 3 big servers: Jenkins, webserver and Foreman/Puppet.
Can we wait with the Jenkins and webserver migration until after 2.0.0 is released, please (hopefully later today)? The release has already been delayed quite a bit by various issues, and I don’t want to have to delay it further because we aren’t able to release while servers are being migrated.
Additionally, I think we should set up some monitoring for how many slots are actually being used by Jenkins. We might be assigning too many or too few resources to it, and it would be good to plan according to the capacity that we actually need. If we could get some insight into the usage of “special” nodes (e.g. debian, arm, ssh…), it might also help us see where our bottlenecks are.
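As a starting point for that kind of monitoring, here is a minimal sketch that summarizes executor usage from a payload shaped like the JSON Jenkins serves at /computer/api/json. The field names used (busyExecutors, totalExecutors, computer, assignedLabels, numExecutors, idle, offline) are my assumption of that shape and should be checked against our instance. The sample data and node names are made up.

```python
from collections import defaultdict


def summarize_executors(payload: dict) -> dict:
    """Summarize overall and per-label executor usage for online nodes."""
    total = payload.get("totalExecutors", 0)
    busy = payload.get("busyExecutors", 0)
    per_label = defaultdict(lambda: {"nodes": 0, "idle_nodes": 0, "executors": 0})

    for node in payload.get("computer", []):
        if node.get("offline"):
            continue  # offline nodes provide no capacity
        for label in node.get("assignedLabels", []):
            stats = per_label[label["name"]]
            stats["nodes"] += 1
            stats["executors"] += node.get("numExecutors", 0)
            if node.get("idle"):
                stats["idle_nodes"] += 1

    return {
        "busy": busy,
        "total": total,
        "utilization": busy / total if total else 0.0,
        "per_label": dict(per_label),
    }


# Made-up sample payload; real data would come from the Jenkins API.
sample = {
    "busyExecutors": 3,
    "totalExecutors": 6,
    "computer": [
        {"displayName": "node-a", "numExecutors": 2, "idle": False,
         "offline": False, "assignedLabels": [{"name": "debian"}]},
        {"displayName": "node-b", "numExecutors": 4, "idle": True,
         "offline": False, "assignedLabels": [{"name": "el7"}, {"name": "ssh"}]},
        {"displayName": "node-c", "numExecutors": 2, "idle": True,
         "offline": True, "assignedLabels": [{"name": "debian"}]},
    ],
}

summary = summarize_executors(sample)
print(summary["utilization"], sorted(summary["per_label"]))
```

Sampling this periodically and keeping the results would show peak versus average usage, and which labels never have an idle node.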
Some operations (e.g. a major release) might need additional resources. If it were easy to scale up or down, we might be able to use some short-lived instances for these cases, without keeping idle machines around that waste resources just so we have capacity for these peaks.