Infrastructure SIG Meeting Notes 1/20

Meeting notes from today's infrastructure SIG meeting. Highlights:

  • Added prioritization list
  • Discussed new business around how to handle reduced capacity from ci.centos.org
  • Discussion of CentOS 8 support plan
    • need to solve capacity issues with ci.centos.org
    • need a target release for support
    • need Vagrant boxes for 8 stream to enable testing

Infrastructure SIG

Agenda

  • Introduction
  • State of Initiatives
  • New Business

Areas of Care

  • Underlying Infrastructure management
  • Jenkins
  • Jenkins Jobs
  • Redmine
  • Website and webservers
  • DNS
  • Foreman and puppetserver
  • Koji

Prioritization:

  • ci.centos.org limits with testing matrix
  • Fix Koji space issue
  • CentOS 8 Stream
  • Archiving old Debian releases
  • Auto-building Debian on PR merge
  • Netways Jenkins node migration
  • Rackspace migration of Jenkins
  • Rackspace migration of Foreman/puppetserver
  • Redmine migration
  • Rebuilding Koji
  • foreman-infra cleanup, ci/ directory
  • Use of Jenkinsfiles
  • New sponsor
  • CDN for website

Initiatives

Rackspace migration

  • Currently needs migration

    • Jenkins
    • Foreman/puppetserver
  • Jenkins

    • Owner: ewoud
    • New hostname: controller01.jenkins.osuosl.theforeman.org
    • Action Items:
      • Create new machine in OSUOSL with CentOS 7
      • Add it to Foreman
      • Assign the right Hostgroup
      • Take an outage window
        • Mark nodes as in maintenance mode in old Jenkins
        • Sync over /var/lib/jenkins
        • Take all but one node out of maintenance mode on old Jenkins
      • Turn on new Jenkins
        • Turn on one node
      • Run a test job
        • Run a nightly pipeline
      • Pick switchover date
        • Target Date: Sometime before Foreman 2.4 branching
        • Lower the DNS TTL a day or two before the target date
        • Update DNS
  • Foreman/puppetserver

    • Owner: ewoud
    • New hostname:
    • Action Items:
      • Split into two virtual machines
      • Manage Foreman with Puppet
        • Write up classes to manage Foreman
        • Put puppet in noop mode
        • Iterate until configuration looks sound, applies cleanly
        • Move puppet out of noop mode
      • Create new machine in OSUOSL with CentOS 7
      • Add new machine to the existing Foreman
        • apply puppet
      • Pick switchover date
        • Target date:
        • Lower the DNS TTL a day or two before the target date
      • Dump database on puppetmaster.theforeman.org
      • Copy files
        • Certificates
        • ??
      • Restore database on new machine
      • Update DNS
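
Both switchovers above lower the DNS TTL before the target date. The reason is that resolvers may keep the old record for up to the old TTL, so the TTL has to be lowered at least that long before the DNS update. A minimal sketch of that arithmetic, assuming a hypothetical cutover time, a hypothetical old TTL of one day, and a one-hour safety margin (none of these values come from the meeting):

```python
from datetime import datetime, timedelta


def latest_ttl_change(cutover, old_ttl_seconds, safety=timedelta(hours=1)):
    """Return the latest moment to lower the TTL.

    Caches may hold the record for up to old_ttl_seconds, so the lower
    TTL must be published at least that long before the cutover.
    """
    return cutover - timedelta(seconds=old_ttl_seconds) - safety


# Hypothetical values for illustration only.
cutover = datetime(2021, 3, 1, 9, 0)
deadline = latest_ttl_change(cutover, old_ttl_seconds=86400)
print("Lower the TTL no later than", deadline)
```

With a one-day TTL, "a day or two before" leaves just enough room; a longer existing TTL would need an earlier change.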

Redmine migration

Owner: ??

  • Currently runs on Scaleway
    • Sponsoring ceased
    • Migrate to OSUOSL
      • Maybe Conova?
  • Current Redmine version: 3.Y
  • Redmine Git Instance
  • Action Items
    • Build out migration plan
    • Test Redmine upgrade locally
    • Upgrade to EL8
      • RHEL 8, if Red Hat gives clarity on open source project usage
      • CentOS 8 Stream otherwise
    • Upgrade Redmine to 4.Y

foreman-infra cleanup, ci/ directory

Owner: ehelms

Use of Jenkinsfiles

Owner: ewoud

  • Prerequisite: Convert all jobs to pipeline style
  • Giving projects control of building their own Jenkins jobs through a Jenkinsfile in the repository
  • Discussion
  • Will require moving to shared libraries instead of composed JJB
    • Still requires storing job definitions in JJB in foreman-infra
  • How to deal with secrets?
    • Do Jenkinsfiles or multi-branch PR jobs have built-in support for this?
  • Idea
    • Spin up a Jenkins server on OSUOSL and test the workflow

Archiving Old Debian Releases

Owner: evgeni

  • Discussion
  • Freight scans old archives back to Foreman 1.2 on every run; archiving them would speed up Debian builds
  • Proposal
    • Pick a date, and archive everything up to Foreman 2.0
    • Continue to expose the archives on an archive site
  • Action Items
    • Build archive site up to Foreman 2.0
    • Pick an archive date
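
The proposal above amounts to splitting the release list at a cutoff version. A minimal sketch of that split, assuming a hypothetical release list and treating the cutoff as exclusive by default (the notes do not say whether 2.0 itself gets archived, so `inclusive` is left as a parameter):

```python
def parse_version(text):
    """Turn '1.24' into (1, 24) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def split_for_archive(releases, cutoff="2.0", inclusive=False):
    """Return (archived, live) release lists split at the cutoff version."""
    limit = parse_version(cutoff)
    archived, live = [], []
    for release in releases:
        version = parse_version(release)
        is_old = version <= limit if inclusive else version < limit
        (archived if is_old else live).append(release)
    return archived, live


# Hypothetical release list for illustration only.
archived, live = split_for_archive(["1.2", "1.24", "2.0", "2.3"])
print("archive site:", archived)
print("scanned by freight:", live)
```

Parsing to integer tuples matters here: a plain string comparison would sort "1.24" before "1.3".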

New Sponsor

Owner: evgeni

  • Conova offered compute resource
  • VMware-based infrastructure, vCloud
    • Is there a difference between vCloud and vSphere?
      • Yes, there is
      • The APIs have similar functions but are different, so Foreman cannot be attached to vCloud
  • How could we make use of this infrastructure?
    • Could add more nodes and reduce slots on existing nodes
    • Could shift AWS nodes to this new infrastructure
  • Asked for 16 vCPU and 40 GB memory
    • Waiting on reply

Auto-building Debian on PR merge

Owner:

  • Need to automate the Debian release logic
  • Current jobs are hard to follow when they fail
  • Action Items
    • Step 1
      • Rewrite the Debian build jobs as pipelines that follow the RPM job pattern
    • Step 2
      • Enable auto-build on PR merge

CDN for the Website

Owner: evgeni

  • Need to fix the RSS and CDN issue in order to serve the website via CDN
  • Pre-work completed
  • Action items
    • RSS statistics via CDN
      • Move RSS to a dedicated host
      • Have the CDN log requests independently
        • Amazon S3
        • SFTP with locked down user on the webserver

Rebuilding Koji

Owner:

  • Rebuilding Koji

  • Koji is a big ole machine

    • current Koji has server, builder, database all-in-one
    • requires a separate builder to handle EL8
    • is not managed by any config management
  • Server/hostnames:

  • Action Items

    • Build a new environment with config management, and then migrate into the new environment
    • Manage Koji through standard means in Foreman
    • Migrate to a new disk format
      • current disk format cannot grow beyond its current size
      • Steps
        • Create new disk
        • Migrate data to new disk
  • Koji running out of space

    • Action Items
      • Look for old OSes synced by mrepo that we can remove
        • Drop Fedora releases older than 29
      • Which OSes could we switch from local sync to using their CDN?
        • Fedora
        • EL7
      • Cleanup of old Foreman and Katello releases

CentOS Stream

  • How to handle build and release on CentOS Stream
    • Foreman is released against CentOS 8
    • Katello is not released against CentOS 8
  • Build
    • Use snapshotted stream repos or use bleeding edge?
  • Release
    • Need to target a Foreman release for CentOS 8 stream release
    • Release Katello 4.0 on CentOS 8?
  • Action Items
    • Add CentOS 8 stream to pipeline tests
    • Foreman 2.4 and Katello 4.0 will release on CentOS 8
    • Hold off on migrating servers until there is more clarity on Stream

Netways Jenkins Node Migration

Owner: evgeni/ewoud

  • Current node will be decommissioned in about a couple of weeks
  • Hostname:
  • Action Items
    • Need to re-create the Jenkins node in their OpenStack environment
    • Delete old Jenkins node on their old infrastructure

ci.centos.org limits with testing matrix

  • Current
    • Jenkins node owned by ci.centos.org
      • Request bare metal machines from Duffy
      • Limited to 6/8 parallel machines from Duffy
      • Each pairing of an OS with install or upgrade requests a machine from Duffy to run a Vagrant pipeline on
    • When a pipeline runs, 2 jobs end up rejected and the pipeline fails
  • Will need to scale to additional OSes:
    • Ubuntu 20.04
    • Debian 11
    • CentOS 8 Stream (would eventually replace CentOS 8)
  • Proposals
    1. Split our release pipelines similar to the nightly split
      • Schedule EL pipeline, if that succeeds schedule the Debian pipeline
    2. Reduce combinations that are run, only a single Debian
    3. Run all installs first, if they pass, run all upgrade jobs
    4. Is there other infrastructure we could explore using available to us?
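
Proposal 3 can be sketched as a small scheduling exercise: build the OS by install/upgrade matrix, then group jobs so installs come first and no batch asks Duffy for more machines than the limit. This is a minimal sketch, assuming a hypothetical OS list and a limit of 6; the function names are illustrative and not taken from any existing job definitions.

```python
from itertools import product


def build_matrix(oses, actions=("install", "upgrade")):
    """Return every (os, action) pairing; each one requests one machine."""
    return list(product(oses, actions))


def staged_batches(matrix, limit):
    """Group jobs so all installs precede upgrades, at most `limit` per batch."""
    batches = []
    for action in ("install", "upgrade"):
        jobs = [job for job in matrix if job[1] == action]
        for start in range(0, len(jobs), limit):
            batches.append(jobs[start:start + limit])
    return batches


# Hypothetical OS list for illustration only.
oses = ["centos7", "centos8", "debian10", "ubuntu18.04"]
matrix = build_matrix(oses)
batches = staged_batches(matrix, limit=6)
for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

With four OSes the full matrix needs 8 machines at once, while the staged version never needs more than 4. A caller would stop before the upgrade batches if any install job fails.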

Completed Items

  • Where to track infrastructure updates? [DONE]
    • Development Discourse topic?
      • Sub-topic “Infrastructure”
  • Schedule Next Meeting [DONE]
  • Post Discourse tracking posts for each initiative [DONE]
    • Track updates

Documentation

Owners: ehelms, ewoud

  • Where to move and store documentation for infrastructure?
    • docs/ directory in foreman-infra written in markdown
      • Source that is outside of our infrastructure
    • Auto-publish docs to GitHub Pages
  • Action Item
    • Create docs/ directory [ehelms]
    • Migrate wiki pages from Redmine [ehelms]
    • Reviews

Webserver migration

  • Owner: Evgeni
  • web02 on Rackspace
  • New machine running in OSUOSL
    • Receives mirrors of yum content
    • Debian content mirroring in progress
  • Action Item
    • Final sync of content
      • Copy over Tomer’s homedir
    • Switchover
      • Target Date: 9/28 - EMEA morning
    • Shutdown web02
      • Target Date: 9/29
    • Destroy
      • Target Date: 10/5

ARM Builders

Owner: evgeni

  • Two currently running on Scaleway
  • Community member raised the idea of sponsoring new ARM servers on AWS
    • Access controls a concern due to Debian push
  • ARM builds disabled as of 2.1
  • Action Item
    • Decide if keeping ARM
      • Proposal: Drop the ARM builds and announce that on Discourse
        • Turn ARM machines off in Scaleway
        • Remove ARM machines from Scaleway

Moving to GH Actions from Travis for Puppet Modules

Owner: ewoud

Open ticket to OSUOSL about slow network connections

Owner: evgeni

  • File a ticket with details on network connection
  • Fixed itself

For anyone who’d like to watch/listen, the meeting was recorded here: