[infra] Reducing Rackspace usage

Dominic_Cleal · February 17, 2015, 9:15am

Hi all,

As some may know, we receive a very sizeable donation from Rackspace to
use their public cloud for our project infrastructure. It primarily
runs Jenkins, slaves, test VMs and our web site.

This last month we went 20% over budget, which they've kindly waived.
However, we need to reduce our usage urgently.

TLDR: nightly packages (deb/RPM) will now be tested and pushed only once
a day (Mon-Fri) until we've reduced our usage. Can the Katello folks
who also use this account do the same immediately?

The usage breakdown is roughly:
Jenkins master and 9 slaves = 51.6%
Foreman (vagrant) test VMs = 15.5%
Katello (vagrant) test VMs = 22.2%
Foreman web site = 9.0%
Misc, one off VMs = 1.8%

We have had some donations of VMs from other people, some of which need
rebuilding, some of which are offline etc. I'll begin trying to get
these back into shape, which will let us take one or two Rackspace
slaves offline.

It's probably most useful for us to keep Rackspace primarily as a
flexible place to launch test VMs (i.e. using vagrant and BATS) and move
as many slaves to the other homes we've been offered, which will reduce
the bulk of the bill and let us increase testing again.

The web site is intriguing: it seems our bandwidth usage has
significantly increased to 1.54TB/month. We could look at having
mirrors for package content, though that adds some complexity and needs
work.

If you're reading this and would like to lend us a host or VM, I've
listed the general requirements for a Jenkins slave here:
Jenkins - Foreman,
please just drop me an e-mail.

Cheers,

···

-- Dominic Cleal Red Hat Engineering

ohadlevy · February 17, 2015, 12:09pm

> Hi all,
>
> As some may know, we receive a very sizeable donation from Rackspace to
> use their public cloud for our project infrastructure. It primarily
> runs Jenkins, slaves, test VMs and our web site.
>
> This last month we went 20% over budget, which they've kindly waived,
> however we need to reduce our usage urgently.
>
> TLDR: nightly packages (deb/RPM) will now be tested and pushed only once
> a day (Mon-Fri) until we've reduced our usage. Can the Katello folks
> who also use this account do the same immediately?
>
> The usage breakdown is roughly:
> Jenkins master and 9 slaves = 51.6%
> Foreman (vagrant) test VMs = 15.5%
> Katello (vagrant) test VMs = 22.2%
> Foreman web site = 9.0%
> Misc, one off VMs = 1.8%
>

thanks, very interesting.

>
> We have had some donations of VMs from other people, some of which need
> rebuilding, some of which are offline etc. I'll begin trying to get
> these back into shape, which will let us take one or two Rackspace
> slaves offline.
>
> It's probably most useful for us to keep Rackspace primarily as a
> flexible place to launch test VMs (i.e. using vagrant and BATS) and move
> as many slaves to the other homes we've been offered, which will reduce
> the bulk of the bill and let us increase testing again.
>
> The web site is intriguing, it seems our bandwidth usage has
> significantly increased to 1.54TB/month. We could look at having
> mirrors for package content, though that adds some complexity and needs
> work.
>

what is our bandwidth limit? is there also a performance issue on top?

···

On Tue, Feb 17, 2015 at 11:15 AM, Dominic Cleal wrote:

> If you’re reading this and would like to lend us a host or VM, I’ve
> listed the general requirements for a Jenkins slave here:
> Jenkins - Foreman,
> please just drop me an e-mail.
>
> Cheers,
>
> –
> Dominic Cleal
> Red Hat Engineering

–
You received this message because you are subscribed to the Google Groups
“foreman-dev” group.
To unsubscribe from this group and stop receiving emails from it, send an
email to foreman-dev+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Dominic_Cleal · February 18, 2015, 2:54pm

Can somebody confirm this has been done please? It looks like multiple
jobs are running each day.

···

On 17/02/15 09:15, Dominic Cleal wrote:
> TLDR: nightly packages (deb/RPM) will now be tested and pushed only once
> a day (Mon-Fri) until we've reduced our usage. Can the Katello folks
> who also use this account do the same immediately?

–
Dominic Cleal
Red Hat Engineering

Dominic_Cleal · February 23, 2015, 12:22pm

I've replaced two Rackspace slaves with three smaller VMs that we had
from Brian; these are now online. Two of them are running EL7, so this
required some yak shaving in our foreman-infra Puppet modules as they
were missing a lot of EL7 support, but it seems to be stable now.

There are a couple of other VMs I still need to revisit and get online,
which may let us replace one more Rackspace slave.

···

On 17/02/15 09:15, Dominic Cleal wrote:
> We have had some donations of VMs from other people, some of which need
> rebuilding, some of which are offline etc. I'll begin trying to get
> these back into shape, which will let us take one or two Rackspace
> slaves offline.

–
Dominic Cleal
Red Hat Engineering

Dominic_Cleal · February 17, 2015, 12:14pm

There's no transfer limit, but we simply "pay" per GB outgoing on public
networks (see the bottom of http://www.rackspace.com/cloud/servers/).
There's only a network/interface speed limit, which is huge.

I don't think there's a performance issue with the web server itself; it
appears to be under light load as it's only serving static files from
SSD storage. It's actually a very small virtual machine.

A few people have mentioned the US to Europe data transfer speed from
our package repos is quite low, so this would also be a good reason to
begin considering mirrors.

···

On 17/02/15 12:09, Ohad Levy wrote:
> On Tue, Feb 17, 2015 at 11:15 AM, Dominic Cleal wrote:
> > It's probably most useful for us to keep Rackspace primarily as a
> > flexible place to launch test VMs (i.e. using vagrant and BATS) and move
> > as many slaves to the other homes we've been offered, which will reduce
> > the bulk of the bill and let us increase testing again.
> >
> > The web site is intriguing, it seems our bandwidth usage has
> > significantly increased to 1.54TB/month. We could look at having
> > mirrors for package content, though that adds some complexity and needs
> > work.
>
> what is our bandwidth limit? is there also a performance issue on top?

–
Dominic Cleal
Red Hat Engineering

Shimon_Shtein · February 18, 2015, 7:32am

Not sure if it's a good idea, but I'll put it here anyway.

Can we consider OpenShift Online for our website?
At least it should offload the bandwidth consumption from Rackspace.

Now that I'm thinking about it, OpenShift has a Jenkins gear too. It should
be possible to move at least the master machine to OpenShift too.

···

On Tuesday, February 17, 2015 at 2:15:09 PM UTC+2, Dominic Cleal wrote:
> On 17/02/15 12:09, Ohad Levy wrote:
> > On Tue, Feb 17, 2015 at 11:15 AM, Dominic Cleal
> > <mailto:dcle...@redhat.com >> wrote:
> > > It's probably most useful for us to keep Rackspace primarily as a
> > > flexible place to launch test VMs (i.e. using vagrant and BATS) and move
> > > as many slaves to the other homes we've been offered, which will reduce
> > > the bulk of the bill and let us increase testing again.
> > >
> > > The web site is intriguing, it seems our bandwidth usage has
> > > significantly increased to 1.54TB/month. We could look at having
> > > mirrors for package content, though that adds some complexity and needs
> > > work.
> >
> > what is our bandwidth limit? is there also a performance issue on top?
>
> There's no transfer limit, but we simply "pay" per GB outgoing on public
> networks (see the bottom of http://www.rackspace.com/cloud/servers/).
> There's only a network/interface speed limit, which is huge.
>
> I don't think there's a performance issue with the web server itself, it
> appears to be under light load as it's only serving static files from
> SSD storage. It's a very small virtual machine in actual fact.
>
> A few people have mentioned the US to Europe data transfer speed from
> our package repos is quite low, so this would also be a good reason to
> begin considering mirrors.
>
> --
> Dominic Cleal
> Red Hat Engineering

ehelms · February 18, 2015, 3:02pm

Updated the Katello release pipeline to run once a day. Note that installer
PRs run our systests as well so you may see some jobs coming from that.

Eric

···

On Wed, Feb 18, 2015 at 9:54 AM, Dominic Cleal wrote:

> On 17/02/15 09:15, Dominic Cleal wrote:
> > TLDR: nightly packages (deb/RPM) will now be tested and pushed only once
> > a day (Mon-Fri) until we’ve reduced our usage. Can the Katello folks
> > who also use this account do the same immediately?
>
> Can somebody confirm this has been done please? It looks like multiple
> jobs are running each day.
>
> –
> Dominic Cleal
> Red Hat Engineering


Justin_Sherrill · February 18, 2015, 3:04pm

I just updated our pipeline to only run once a day.

We are still running our BATS tests for EL6 and EL7 on the katello-installer
pull requests. We could look at moving these to more manual
testing in the meantime. Thoughts on this, other Katello devs?

-Justin

···

On 02/18/2015 09:54 AM, Dominic Cleal wrote:
> On 17/02/15 09:15, Dominic Cleal wrote:
>> TLDR: nightly packages (deb/RPM) will now be tested and pushed only once
>> a day (Mon-Fri) until we've reduced our usage. Can the Katello folks
>> who also use this account do the same immediately?
> Can somebody confirm this has been done please? It looks like multiple
> jobs are running each day.

Justin_Sherrill · February 18, 2015, 3:05pm

Just to be sure, we did it twice!

···

On 02/18/2015 10:02 AM, Eric D Helms wrote:
> Updated the Katello release pipeline to run once a day. Note that
> installer PRs run our systests as well so you may see some jobs coming
> from that.
>
> Eric
>
> On Wed, Feb 18, 2015 at 9:54 AM, Dominic Cleal wrote:
> > On 17/02/15 09:15, Dominic Cleal wrote:
> > > TLDR: nightly packages (deb/RPM) will now be tested and pushed only once
> > > a day (Mon-Fri) until we've reduced our usage. Can the Katello folks
> > > who also use this account do the same immediately?
> >
> > Can somebody confirm this has been done please? It looks like multiple
> > jobs are running each day.
> >
> > --
> > Dominic Cleal
> > Red Hat Engineering

Dominic_Cleal · February 18, 2015, 3:13pm

Thanks both!

···

On 18/02/15 15:05, Justin Sherrill wrote:
> On 02/18/2015 10:02 AM, Eric D Helms wrote:
>> Updated the Katello release pipeline to run once a day. Note that
>> installer PRs run our systests as well so you may see some jobs coming
>> from that.
>>
>> Eric
>>
>> On Wed, Feb 18, 2015 at 9:54 AM, Dominic Cleal wrote:
>>> On 17/02/15 09:15, Dominic Cleal wrote:
>>>> TLDR: nightly packages (deb/RPM) will now be tested and pushed only once
>>>> a day (Mon-Fri) until we've reduced our usage. Can the Katello folks
>>>> who also use this account do the same immediately?
>>>
>>> Can somebody confirm this has been done please? It looks like multiple
>>> jobs are running each day.
>
> Just to be sure, we did it twice!

–
Dominic Cleal
Red Hat Engineering

Dominic_Cleal · February 18, 2015, 8:17am

> Not sure if it's a good idea, but I'll put it here anyway.
>
> Can we consider Openshift Online for our website?
> At least it should offload the bandwidth consumption from Rackspace.

OpenShift didn't have a gear suitable for basic file serving last time
I checked; it's targeted towards dynamic platforms. We'd also have to
put all of our packages and metadata into a git repo in order to push
and serve it, which isn't a very natural fit.

> Now that I'm thinking about it, Openshift has Jenkins gear too. It
> should be possible to move at least the master machine to Openshift too.

That might be possible, yep. I remember it used to launch slaves on
OpenShift, but presumably it can still connect to remote slaves?

On the subject of Jenkins though, Karanbir from CentOS called into
#theforeman-dev yesterday to talk about a new test environment that the
project's launching. I believe he's going to post to the list with more
details soon so we can discuss that idea.

···

On 18/02/15 07:32, Shim Shtein wrote:

–
Dominic Cleal
Red Hat Engineering

lzap · February 18, 2015, 12:03pm

> > Can we consider Openshift Online for our website?
> > At least it should offload the bandwidth consumption from Rackspace.
>
> OpenShift doesn't have a gear suitable for basic file serving last time
> I checked, it's targeted towards dynamic platforms. We'd also have to
> put all of our packages and metadata into a git repo in order to push
> and serve it, which isn't a very natural fit.

Just to make Dominic's point perfectly clear (I also struggled the very
first time I read his words):

It's not our website (HTML + images), but our repositories which
generate the bandwidth :-) If it was our website, the webserver is six
feet under already because of the load coming from the repositories.
We need a slashdot article to get there. :-)

My guess is that 60% of the bandwidth is used by us, devs/QEs. For
example, I do at least 5 nightly installations daily. I don't think it
is necessary to run these against a paid Rackspace instance. If we are
able to run just another mirror somewhere else and use that in our
tests, we might be able to bring the usage down without rolling out
mirrors for users. There is work to be done for this (yum
mirror lists etc.), but we could at least offer an installer
switch to select the closest possible mirror.

Dom, where is the file serving instance located (geographically)?

···

-- Later, Lukas #lzap Zapletal

Dominic_Cleal · February 18, 2015, 12:17pm

>>> Can we consider Openshift Online for our website?
>>> At least it should offload the bandwidth consumption from Rackspace.
>>
>> OpenShift doesn't have a gear suitable for basic file serving last time
>> I checked, it's targeted towards dynamic platforms. We'd also have to
>> put all of our packages and metadata into a git repo in order to push
>> and serve it, which isn't a very natural fit.
>
> Just to make Dominic perfectly clear (I also struggled the very first
> time when I read his words).
>
> It's not our website (HTML + images), but our repositories which
> generate the bandwidth :-) If it was our website, the webserver is six
> feet under already because of the load coming from the repositories.
> We need a slashdot article to get there. :-)

Yep, sorry.

> My guess is that 60% of the bandwidth is done by us, devs/qes. For
> example I do 5 nightly installations daily, at least.

I reckon you're under-estimating our userbase

> I don't think it
> is necessary to run against Rackspace paid instance for this. If we are
> able to run just another mirror somewhere else and use that in our
> tests, we might be able to bring the usage down without rolling out
> mirrors for users. There is work needed to be done to do this (yum
> mirror lists etc). But we could be able to offer at least installer
> switch to select closest mirror possible.

Yeah, a mirror list would work for the yum repos and would be simpler
than a DNS solution.
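
As a rough illustration of the mirror list idea (a sketch only: the mirror
hostnames, repo id and directory layout below are made up and are not our
real setup). yum's mirrorlist= option points at a plain-text list with one
base URL per line, so generating both the list and the .repo stanza is
straightforward:

```python
# Hypothetical mirrors; none of these hosts exist.
MIRRORS = [
    "http://mirror-us.example.org/foreman",
    "http://mirror-eu.example.org/foreman",
]


def render_mirrorlist(mirrors, release, dist, arch):
    """Return mirrorlist content: one base URL per line.

    The release/dist/arch path layout is an assumption for illustration.
    """
    return "".join(
        "{0}/{1}/{2}/{3}\n".format(m.rstrip("/"), release, dist, arch)
        for m in mirrors
    )


def render_repo(repo_id, mirrorlist_url):
    """Return a .repo stanza that uses mirrorlist= instead of baseurl=."""
    return (
        "[{0}]\n"
        "name={0}\n"
        "mirrorlist={1}\n"
        "enabled=1\n"
    ).format(repo_id, mirrorlist_url)


if __name__ == "__main__":
    print(render_mirrorlist(MIRRORS, "nightly", "el7", "x86_64"))
    # The mirrorlist URL is hypothetical as well.
    print(render_repo("foreman-nightly",
                      "http://mirrors.example.org/nightly-el7-x86_64.txt"))
```

yum then picks a working entry from the list on the client side, which is
why this needs no DNS changes.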

> Dom, where is the file serving instance located (geographically)?

In Northern Virginia (IAD).

···

On 18/02/15 12:03, Lukas Zapletal wrote:

–
Dominic Cleal
Red Hat Engineering

ohadlevy · February 18, 2015, 12:20pm

> >>> Can we consider Openshift Online for our website?
> >>> At least it should offload the bandwidth consumption from Rackspace.
> >>
> >> OpenShift doesn't have a gear suitable for basic file serving last time
> >> I checked, it's targeted towards dynamic platforms. We'd also have to
> >> put all of our packages and metadata into a git repo in order to push
> >> and serve it, which isn't a very natural fit.
> >
> > Just to make Dominic perfectly clear (I also struggled the very first
> > time when I read his words).
> >
> > It's not our website (HTML + images), but our repositories which
> > generate the bandwidth :-) If it was our website, the webserver is six
> > feet under already because of the load coming from the repositories.
> > We need a slashdot article to get there. :-)
>
> Yep, sorry.
>
> > My guess is that 60% of the bandwidth is done by us, devs/qes. For
> > example I do 5 nightly installations daily, at least.
>
Do you happen to know if traffic from other Rackspace instances (e.g. other
Jenkins slaves etc.) costs money? I would assume that internal network usage
should not cost anything, or at least cost less.

···

On Wed, Feb 18, 2015 at 2:17 PM, Dominic Cleal wrote:
> On 18/02/15 12:03, Lukas Zapletal wrote:
>
> I reckon you’re under-estimating our userbase
>
> > I don’t think it
> > is necessary to run against Rackspace paid instance for this. If we are
> > able to run just another mirror somewhere else and use that in our
> > tests, we might be able to bring the usage down without rolling out
> > mirrors for users. There is work needed to be done to do this (yum
> > mirror lists etc). But we could be able to offer at least installer
> > switch to select closest mirror possible.
>
> Yeah, a mirror list would work for the yum repos and would be simpler
> than a DNS solution.
>
> > Dom, where is the file serving instance located (geographically)?
>
> In Northern Virginia (IAD).
>
> –
> Dominic Cleal
> Red Hat Engineering


ehelms · February 18, 2015, 12:33pm

We serve katello.org on OpenShift as static assets. While there isn't a
pure static gear, the ruby gear allows placing built assets into a
public folder that is then served. We avoid storing the statically built
site in git by using a Jenkins job that runs a build script, commits the
result to git locally on Jenkins and then pushes that to OpenShift. The
branch with the OpenShift configuration and our build script can be seen at
[1] and the Jenkins job at [2].

[1] GitHub - Katello/katello.org at deploy
[2] http://ci.theforeman.org/job/deploy_katello_site/
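
For anyone who wants the shape of that job without opening the links, here
is a minimal sketch of the build, commit and push sequence. It is not the
actual script from [1]: the build script name, the remote name and the
target branch are placeholders I've made up, and only the "public" folder
comes from the description above.

```python
import subprocess


def deploy_commands(build_script, remote, branch="master"):
    """Return the commands for: build, commit locally, push to the remote."""
    return [
        [build_script],                                   # writes the site into public/
        ["git", "add", "--all", "public"],                # stage the built assets
        ["git", "commit", "-m", "Automated site build"],  # local-only commit
        ["git", "push", "--force", remote, "HEAD:" + branch],
    ]


def deploy(build_script, remote, branch="master", dry_run=True):
    """Print the commands, or run them in order when dry_run is False."""
    for cmd in deploy_commands(build_script, remote, branch):
        if dry_run:
            print(" ".join(cmd))
        else:
            # Raises CalledProcessError on failure, e.g. when there is
            # nothing new to commit.
            subprocess.check_call(cmd)


if __name__ == "__main__":
    deploy("./build.sh", "openshift")  # hypothetical script and remote names
```

Because the commit only ever exists in the Jenkins workspace and on the
remote being pushed to, the built site never lands in the main repository.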

Eric

···

On Wed, Feb 18, 2015 at 7:20 AM, Ohad Levy wrote:


Dominic_Cleal · February 18, 2015, 12:32pm

I believe traffic over the internal network (10.x) is free, but traffic
over the public interface is billed. We only use the public interface
addresses; nothing is configured to expect dual-homed hosts. We could
reconfigure the Jenkins slaves to use the internal IPs, which should work fine.
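
To make the distinction concrete, here is a small sketch of how one might
check which side of the line a connection falls on. It assumes the internal
network is all of 10.0.0.0/8 and that traffic is free only when both ends
are on it; that rule is my reading of the billing, not something confirmed
by Rackspace documentation, and the addresses are examples only.

```python
import ipaddress

# Assumption: the internal network is the whole 10.x range mentioned above.
INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")


def is_internal(addr):
    """True if addr is on the (assumed) internal network."""
    return ipaddress.ip_address(addr) in INTERNAL_NET


def is_billed(src, dst):
    """Assumed rule: billed unless both ends are on the internal network."""
    return not (is_internal(src) and is_internal(dst))


if __name__ == "__main__":
    # Master and slave talking over internal addresses.
    print(is_billed("10.1.2.3", "10.4.5.6"))
    # Slave talking to an external site (documentation-range address).
    print(is_billed("10.1.2.3", "192.0.2.10"))
```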

That said, I don't think we have a lot of internal-only traffic.
Jenkins will use some, but bandwidth usage on the slaves was low
(6-18GB/month/each), and I expect the bulk is actually between the
slaves and external sites like GitHub/rubygems.org.

Our EL/Fedora test VMs actually communicate with our Koji instance (on
Amazon EC2) rather than our public yum repositories, as they're testing
packages before they're pushed. Our Debian/Ubuntu test VMs do, however,
communicate with the public web host, but if I'm reading the usage data
correctly, their bandwidth usage is also negligible.

Similar usage stats for bandwidth specifically:
Foreman test VMs = 0.03%
Katello test VMs = 0.04%
Jenkins slaves = 5.69%
Jenkins master = 3.33%
Foreman website = 90.9%
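
As a quick cross-check of these shares against the per-slave figure above,
here is a back-of-the-envelope calculation. It assumes the 1.54TB/month
mentioned earlier in the thread is the total for the whole account (in
decimal TB) and that the nine Rackspace slaves from the original breakdown
still apply; both are assumptions rather than numbers taken from the bill.

```python
# Assumption: 1.54TB/month is the account-wide total, 1 TB = 1000 GB.
TOTAL_GB_PER_MONTH = 1.54 * 1000
# Shares from the bandwidth breakdown above.
SLAVE_SHARE = 5.69 / 100
WEBSITE_SHARE = 90.9 / 100
# Assumption: nine slaves, as in the original usage breakdown.
RACKSPACE_SLAVES = 9

slaves_gb = TOTAL_GB_PER_MONTH * SLAVE_SHARE
per_slave_gb = slaves_gb / RACKSPACE_SLAVES
website_gb = TOTAL_GB_PER_MONTH * WEBSITE_SHARE

if __name__ == "__main__":
    print("all slaves: %.1f GB/month" % slaves_gb)
    print("per slave:  %.1f GB/month" % per_slave_gb)
    print("website:    %.1f GB/month" % website_gb)
```

Under those assumptions each slave comes out at roughly 9.7GB/month, which
sits inside the 6-18GB/month range, so the two sets of numbers are at least
consistent with each other.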

···

On 18/02/15 12:20, Ohad Levy wrote:
> On Wed, Feb 18, 2015 at 2:17 PM, Dominic Cleal wrote:
> > On 18/02/15 12:03, Lukas Zapletal wrote:
> > >>> Can we consider Openshift Online for our website?
> > >>> At least it should offload the bandwidth consumption from Rackspace.
> > >>
> > >> OpenShift doesn't have a gear suitable for basic file serving last time
> > >> I checked, it's targeted towards dynamic platforms. We'd also have to
> > >> put all of our packages and metadata into a git repo in order to push
> > >> and serve it, which isn't a very natural fit.
> > >
> > > Just to make Dominic perfectly clear (I also struggled the very first
> > > time when I read his words).
> > >
> > > It's not our website (HTML + images), but our repositories which
> > > generate the bandwidth :-) If it was our website, the webserver is six
> > > feet under already because of the load coming from the repositories.
> > > We need a slashdot article to get there. :-)
> >
> > Yep, sorry.
> >
> > > My guess is that 60% of the bandwidth is done by us, devs/qes. For
> > > example I do 5 nightly installations daily, at least.
>
> do you happen to know if rackspace instances (e.g. other jenkins slaves
> etc) costs money ? I would assume that internal network usage should not
> cost / cost less?

–
Dominic Cleal
Red Hat Engineering