Intermittent Jenkins failures today

Today I've seen a few intermittent failures in our Jenkins environment,
which I hope to have fixed shortly. (These are infrastructure problems,
not intermittent errors in the tests themselves.)

  1. One of the slaves ran out of space in ~/tmp because of some very large
    files that npm generated and never cleaned up. I've added a cron job
    (https://git.io/vwPsQ) to remove these.

  2. Thread creation errors started appearing with the release of Bundler
    1.12.0, which caused the Jenkins agent to crash, go offline, and try to
    resume. This in turn caused Xvfb-related errors, as Xvfb wasn't shut
    down properly.
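
For anyone wanting something similar on their own build hosts, here is a
minimal sketch of the kind of cleanup described in item 1. It is not the
cron job linked above; the function name, the one-day age threshold and
the choice to judge files by modification time are all assumptions made
for illustration.

```python
import os
import time


def remove_stale_files(root, max_age_days=1.0, now=None):
    """Delete regular files under root older than max_age_days.

    Hypothetical helper for illustration only; returns the removed paths.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat judges a symlink by its own mtime, not its target's.
                if os.lstat(path).st_mtime < cutoff:
                    os.remove(path)
                    removed.append(path)
            except OSError:
                # The file vanished or cannot be removed (for example, a
                # running job still owns it), so skip it and carry on.
                continue
    return removed
```

Calling remove_stale_files(os.path.expanduser("~/tmp")) from a daily cron
entry would be one way to use it. Directories are left in place, so
running jobs don't lose their working paths.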

I've increased the nproc limit to cope better with the requirements of
Bundler (https://git.io/vwP5O), but each slave needs a restart before the
new limit takes effect. I'm starting this now, so expect jobs to move
through the queue more slowly over the next few hours while capacity is
reduced. Hopefully this will stop the errors once it's rolled out.
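
To check whether a slave has picked up the new limit after its restart,
something like the sketch below could be run as the Jenkins user. It is
only an illustration: the function names are made up, and it assumes a
Linux host, where threads count against the per-user nproc limit.

```python
import resource
import threading


def describe_nproc_limit():
    """Return the current soft and hard nproc limits as a readable string."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)

    def fmt(value):
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return "nproc soft=%s hard=%s" % (fmt(soft), fmt(hard))


def count_startable_threads(wanted):
    """Try to start `wanted` idle threads and return how many started.

    Hypothetical probe: Python raises RuntimeError when the OS refuses
    to create another thread, which is the symptom described above.
    """
    release = threading.Event()
    started = []
    try:
        for _ in range(wanted):
            thread = threading.Thread(target=release.wait)
            try:
                thread.start()
            except RuntimeError:
                break
            started.append(thread)
    finally:
        # Let every probe thread exit, then wait for them all.
        release.set()
        for thread in started:
            thread.join()
    return len(started)
```

If count_startable_threads(n) returns fewer than n for a modest n, the
limit is probably still too low for that user.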

Please let me know if you are still seeing similar issues at the start
of next week.

--
Dominic Cleal
dominic@cleal.org

I've now removed Xvfb from our test jobs, as it hasn't been needed for a
while now that the integration tests use Poltergeist. This should prevent
the errors seen in subsequent tests when an Xvfb shutdown fails.

On 29/04/16 16:28, Dominic Cleal wrote:
> 2. Thread creation errors started appearing with the release of Bundler
> 1.12.0, which caused the Jenkins agent to crash, go offline, and try to
> resume. This then caused Xvfb related errors as it didn't get shutdown
> properly.


Dominic Cleal
dominic@cleal.org