Hello,
Earlier today, at around 9:00 UTC, we noticed that Jenkins was not responding.
@mmoll logged into the server to restart it, noticed some suspicious processes running, and notified me.
After some investigation, we discovered that at around 8:00 UTC an attacker gained access to the jenkins user account on the server using a remote code execution exploit in one of our outdated Jenkins plugins. The attacker used the exploit to install crypto-mining malware on the server. The malware exhausted the server's resources, causing the Jenkins outage.
By 9:29 UTC the malware had been terminated, and continued monitoring indicated no further attack attempts.
At 11:45 UTC the @infra team held a sync-up meeting and decided to take precautionary action in case the attacker had managed to cause any further damage or compromise secrets held on the server. We decided to decommission the existing server and stand up a new server for the Jenkins master, with all plugins updated to the latest versions and the underlying OS upgraded to CentOS 7.
As a further precaution, we have revoked and updated various secrets held on the server.
We are currently in the process of bringing the worker nodes back up and restoring various previous configurations on the new Jenkins master. We hope that sometime tomorrow most of the previous functionality will be restored.
Due to the various plugin upgrades, some pipelines may now fail to work correctly, and we will continue to restore and fix them as we discover issues. Expect some instability over the coming days while we work through these issues.
If you have any further questions on this matter, feel free to reach out to me.
I would like to personally thank @mmoll, @ekohl and @evgeni for all of the hard work they’ve put into this effort today, and apologize to all developers whose workflow has been disrupted by this unplanned outage.