Infrastructure SIG Meeting Notes 2021-04-01

Infrastructure SIG

Agenda

  • Introduction
  • State of Initiatives
    • Completed
  • New Business

Areas of Care

  • Underlying Infrastructure management
  • Jenkins
  • Jenkins Jobs
  • Redmine
  • Website and webservers
  • DNS
  • Foreman and puppetserver
  • Koji

Prioritization:

  • Moving katello repositories to yum.theforeman.org
  • ci.centos.org limits with testing matrix
  • New Debian signing key
  • Fix Koji space issue
  • CentOS 8 Stream
  • Archiving old Debian releases
  • Auto-building Debian on PR merge
  • Netways Jenkins node migration
  • Rackspace migration of Jenkins
  • Rackspace migration of Foreman/puppetserver
  • Redmine migration
  • Rebuilding Koji
  • Use of Jenkinsfiles
  • New sponsor
  • CDN for website

Initiatives

Moving katello repositories to yum.theforeman.org

  • Katello repositories as of 4.0 and nightly are publishing to yum.theforeman.org
    • Katello 3.17 and 3.18 are still served from fedorapeople
  • Action Items
    • what do we want to do with the old releases on fedorapeople?
      • Leave them there
    • when do we want to stop publishing to fedorapeople?
      • Now for nightly [evgeni]
      • Remove nightly from fedorapeople [ehelms]
        • Try to keep the katello-repos RPM so users end up on the new repositories
    • Add deprecation note to fedorapeople [ehelms]

ci.centos.org limits with testing matrix

  • Current
    • Jenkins node owned by ci.centos.org
      • Request bare metal machines from Duffy
      • Limited to 6/8 parallel machines from Duffy
      • Each OS and action (install, upgrade) pairing requests its own machine from Duffy to run a Vagrant pipeline on
    • When a pipeline runs, 2 jobs end up rejected and the pipeline fails
  • Will need to scale to additional OSes:
    • Ubuntu 20.04
    • Debian 11
    • CentOS 8 Stream (would eventually replace CentOS 8)
  • Proposals
    1. Split our release pipelines similar to the nightly split [blocker to adding 2+ new OSes]
      • Schedule EL pipeline, if that succeeds schedule the Debian pipeline
      • Ground work done to enable the split, next step is to split the jobs
    2. Reduce combinations that are run, only a single Debian
    3. Run all installs first, if they pass, run all upgrade jobs
    4. Is there other infrastructure available to us that we could explore using?
      • Exploring what IBM Cloud could give us
        • Reasonably priced virt instances
        • Able to run Vagrant on top of them
        • Performance comparable to ci.centos.org
      • Exploring Conova supplying HP boxes
    5. Look into throttling inside Jenkins
      • https://plugins.jenkins.io/throttle-concurrents/#example-2-throttling-of-parallel-steps
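
A rough sketch of the "run all installs first, then upgrades" idea under a machine cap may help when weighing the proposals. The OS names, the cap of 6, and the function name `plan_phases` are illustrative assumptions, not the real pipeline definitions:

```python
def plan_phases(oses, max_machines):
    """Group jobs into phases that never exceed max_machines at once.

    All install jobs are scheduled before any upgrade job, so upgrades
    can be skipped entirely if an install phase fails.
    """
    phases = []
    for action in ("install", "upgrade"):
        jobs = [(os_name, action) for os_name in oses]
        # Chunk each action's jobs so a single phase fits within the cap.
        for start in range(0, len(jobs), max_machines):
            phases.append(jobs[start:start + max_machines])
    return phases


# Hypothetical four-OS matrix; the names are placeholders.
oses = ["el7", "el8", "debian10", "ubuntu1804"]
for number, phase in enumerate(plan_phases(oses, max_machines=6), start=1):
    print(number, phase)
```

With four OSes this gives two phases of four machines each, where requesting every pairing at once would need eight machines.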

Debian Signing Key needs extension

  • Expires at the end of March 2021
  • Last time:
    • Extended the expiration date
  • Action Items
    • Extend the key for 2.4
    • Build a plan to rotate to a new key for nightly+ (target the Foreman 3.0 release stream)
    • Document how to extend the key

Koji running out of space

CentOS Stream

  • How to handle build and release on CentOS Stream
    • Foreman is released against CentOS 8
    • Katello is not released against CentOS 8
  • Build
    • Use snapshotted stream repos or use bleeding edge?
  • Release
    • Need to pick which Foreman release will be the first to support CentOS 8 Stream
  • Now available as a base box in Forklift
  • Action Items
    • Run local pipeline tests to uncover any issues before adding to pipelines
    • Add CentOS 8 Stream to pipeline tests
      • Given nightly pipelines are split across EL and Debian, we should not hit the ci.centos.org limits
    • Foreman 2.4 and Katello 4.0 will release on CentOS 8
    • Wait on migrating infrastructure servers till more clarity with Stream

Archiving Old Debian Releases

Owner: evgeni

  • Discussion
    • Freight scans old archives back to Foreman 1.2 on every run; archiving them would speed up Debian builds
  • Proposal
    • Pick a date, and archive everything up to Foreman 2.0
    • Continue to expose the archives on an archive site
    • Add to release procedure to archive N-5 version
  • Action Items
    • Build archive site up to Foreman 2.0 [DONE]
      • http://archivedeb.theforeman.org/
    • Pick an archive date [DONE]
      • March 8th
      • All Foreman 1.X releases (~45 GB)
    • Add to release procedure to archive N-5 version
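
The "archive N-5" rule could be sketched as below, assuming it means "the release stream five positions before the one being released". That reading, the function name `release_to_archive`, and the stream list are assumptions for illustration only:

```python
def release_to_archive(streams, current, offset=5):
    """Return the stream `offset` positions before `current`, or None.

    `streams` must be in release order. The order is passed in explicitly
    because version numbers alone do not say how many releases lie
    between two streams (for example across a major version bump).
    """
    target = streams.index(current) - offset
    return streams[target] if target >= 0 else None


# Illustrative ordered list of release streams.
streams = ["1.22", "1.23", "1.24", "2.0", "2.1", "2.2", "2.3", "2.4"]
print(release_to_archive(streams, "2.4"))  # prints 1.24
```

Returning `None` when there is no stream that far back keeps the release procedure step a no-op rather than an error.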

Auto-building Debian on PR merge

Owner:

  • Need to automate the Debian release logic
  • Current jobs are hard to follow when they fail
  • Action Items
    • Step 1
      • Re-write the Debian build jobs into pipelines that follow the RPM job pattern
    • Step 2
      • Enable auto-build on PR merge

Rackspace migration

  • Currently needs migration

    • Jenkins
    • Foreman/puppetserver
  • Jenkins

    • Owner: ewoud
    • New hostname: controller01.jenkins.osuosl.theforeman.org
    • Action Items:
      • Create new machine in OSUOSL with CentOS 7
      • Add it to Foreman
      • Assign the right Hostgroup
      • Take an outage window
        • Mark nodes as in maintenance mode in old Jenkins
        • Sync over /var/lib/jenkins
        • Take all but one node out of maintenance mode on old Jenkins
      • Turn on new Jenkins
        • Turn on one node
      • Run a test job
        • Run a nightly pipeline
      • Pick switchover date
        • Target Date: Sometime before Foreman 2.5 branching
        • Lower TTL day or two before target date
        • Update DNS
  • Foreman/puppetserver

    • Owner: ewoud
    • New hostname:
    • Action Items:
      • Split into two virtual machines
      • Manage Foreman with Puppet
        • Write up classes to manage Foreman
        • Put Puppet in noop mode
        • Iterate until configuration looks sound, applies cleanly
        • Move Puppet out of noop mode
      • Create new machine in OSUOSL with CentOS 7
      • Add new machine to the existing Foreman
        • Apply Puppet
      • Pick switchover date
        • Target date:
        • Lower TTL day or two before target date
      • Dump database on puppetmaster.theforeman.org
      • Copy files
        • Certificates
        • ??
      • Restore database on new machine
      • Update DNS
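
Both migrations include a "lower TTL a day or two before" step. The reasoning is that resolvers may cache the old record for up to the old TTL, so the TTL has to be lowered at least that long before the switchover. A minimal sketch of the arithmetic, where the date and the 24-hour TTL are placeholders and not the real values:

```python
from datetime import datetime, timedelta


def latest_ttl_change(switchover, old_ttl_seconds):
    """Latest moment to lower the TTL so that caches still holding the
    record with the old TTL have expired by the switchover."""
    return switchover - timedelta(seconds=old_ttl_seconds)


# Placeholder switchover time and TTL, for illustration only.
switchover = datetime(2030, 1, 15, 8, 0)
print(latest_ttl_change(switchover, old_ttl_seconds=86400))
```

With a 24-hour TTL the change must land at least one full day ahead, which is why "a day or two" leaves a comfortable margin.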

Redmine migration

Owner: ??

  • Currently runs on Scaleway
    • Sponsoring ceased
    • Migrate to OSUOSL
      • Maybe conova?
  • Current Redmine version: 3.Y
  • Redmine Git Instance
  • Action Items
    • Build out migration plan
    • Test Redmine upgrade locally
    • Upgrade to EL8
      • RHEL 8? if RH gives clarity on open source project usage
      • CentOS 8 Stream otherwise
    • Upgrade Redmine to 4.Y

Rebuilding Koji

Owner:

  • Rebuilding Koji
  • Koji is a big ole machine
    • current Koji has server, builder, database all-in-one
    • requires a separate builder to handle EL8
    • is not managed by any config management
  • Server/hostnames:
  • Action Items
    • Build a new environment with config management, and then migrate into the new environment
    • Manage Koji through standard means in Foreman
    • Migrate to a new disk format
      • current disk format cannot grow beyond its current size
      • Steps
        • Create new disk
        • Migrate data to new disk

Use of Jenkinsfiles

Owner: ewoud

  • Prerequisite: Convert all jobs to pipeline style
  • Giving projects control of building their own Jenkins jobs through a Jenkinsfile in the repository
  • Discussion
  • Will require moving to shared libraries instead of composed JJB
    • Still requires storing job definitions in JJB in foreman-infra
  • How to deal with secrets?
    • Does Jenkinsfile or Multi-branch PR have built-in support for this?
  • Idea
    • Spin up a Jenkins server on OSUOSL and test the workflow

New Sponsor

Owner: evgeni

  • Conova offered compute resources
  • VMware-based infrastructure, vCloud
    • Is there a difference between vCloud and vSphere?
      • Yes, there is
      • The APIs have similar functions but are different, so Foreman cannot be attached to vCloud
  • How could we make use of this infrastructure?
    • Could add more nodes and reduce slots on existing nodes
    • Could shift AWS nodes to this new infrastructure
  • Asked for 16 vCPU and 40 GB memory
    • Waiting on reply

CDN for the Website

Owner: evgeni

  • Need to fix RSS and CDN issue in order to serve the website via CDN
  • Pre-work completed
  • Action items
    • RSS statistics via CDN
      • Move RSS to a dedicated host
      • CDN logs requests independently
        • Amazon S3
        • SFTP with locked down user on the webserver
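
Wherever the request logs end up, the RSS statistics come down to counting hits on the feed paths. A minimal sketch, assuming access logs in the common/combined log format and a hypothetical feed path `/feed.xml` (neither the CDN's actual log format nor the real feed URLs are confirmed here):

```python
import re
from collections import Counter

# Matches the quoted request line of a common/combined format log entry.
REQUEST = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+"')


def count_feed_hits(lines, feed_paths):
    """Count requests per feed path across an iterable of log lines."""
    hits = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match and match.group("path") in feed_paths:
            hits[match.group("path")] += 1
    return hits


# Made-up sample lines using documentation-range IP addresses.
sample = [
    '192.0.2.1 - - [01/Apr/2021:10:00:00 +0000] "GET /feed.xml HTTP/1.1" 200 512',
    '192.0.2.2 - - [01/Apr/2021:10:00:05 +0000] "GET /index.html HTTP/1.1" 200 2048',
    '192.0.2.1 - - [01/Apr/2021:10:05:00 +0000] "GET /feed.xml HTTP/1.1" 304 0',
]
print(count_feed_hits(sample, {"/feed.xml"}))
```

The same function works whether the lines are read from files pulled from S3 or from files delivered over SFTP.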

RHEL for Open Source Infrastructure

  • 200 subscriptions by default, can be extended
  • Three parts:
    • Building in Koji against RHEL (are we allowed to host the RHEL repos under ROSI?)
    • Testing in CI for user support
    • Running Foreman infrastructure on RHEL
  • Long Term Goal:
    • Build on RHEL 8
    • Test against CentOS 8 Stream
    • Test on RHEL 8
    • Implicitly support RHEL clones

OSCI.io

Completed Items

  • Where to track infrastructure updates? [DONE]
    • Development discourse topic?
      • Sub-topic “Infrastructure”
  • Schedule Next Meeting [DONE]
  • Post Discourse tracking posts for each initiative [DONE]
    • Track updates

Documentation

Owners: ehelms, ewoud

  • Where to move and store documentation for infrastructure?
    • docs/ directory in foreman-infra written in markdown
      • Source that is outside of our infrastructure
    • Auto-publish docs to GitHub Pages
  • Action Item
    • Create docs/ directory [ehelms]
    • Migrate wiki pages from Redmine [ehelms]
    • Reviews

Webserver migration

  • Owner: Evgeni
  • web02 on Rackspace
  • New machine running in OSUOSL
    • Receives mirrors of yum content
    • Debian content mirroring in progress
  • Action Item
    • Final sync of content
      • Copy over Tomer’s homedir
    • Switchover
      • Target Date: 9/28 - EMEA morning
    • Shutdown web02
      • Target Date: 9/29
    • Destroy
      • Target Date: 10/5

ARM Builders

Owner: evgeni

  • Two currently running on Scaleway
  • Community member raised sponsoring new ARM servers on AWS
    • Access controls a concern due to Debian push
  • ARM builds disabled as of 2.1
  • Action Item
    • Decide if keeping ARM
      • Proposal: Drop the ARM builds, announce that to discourse
        • Turn ARM machines off in Scaleway
        • Remove ARM machines from Scaleway

Moving to GH Actions from Travis for Puppet Modules

Owner: ewoud

Open ticket to OSUOSL about slow network connections

Owner: evgeni

  • File a ticket with details on network connection
  • Fixed itself

foreman-infra cleanup, ci/ directory

Owner: ehelms

Netways Jenkins Node Migration

Owner: evgeni/ewoud

  • Current node will be decommissioned in about a couple of weeks
  • Hostname:
  • Action Items
    • Need to re-create Jenkins node in their Openstack environment
    • Delete old Jenkins node on their old infrastructure

Watch or listen back here: