High Availability - best practices docs group

Hi all,

Over the past few months we've seen increasing interest in HA deployments,
especially in the IRC channel. We've also seen quite a few anecdotal
success stories, so it seems like we have some good hands-on experience in
the community now.

I'd like to see if we can collaborate on writing up something for the
community to use for HA setups. I know a few people are already interested
in contributing to this effort (hopefully they'll chime in :P). The goal
would be to distil the community experiences into something generic -
step-by-step notes for specific tasks, choices/considerations to make for
solving given questions (eg where to terminate SSL) and so on.

The two main questions in my head are:

  1. Where should this live? Should we add a new section to the manual? Or a
    "white papers" sort of thing? It feels like it could be a bit long-winded
    compared to some of the shorter scenarios in the manual, but a whole new
    section doesn't feel quite right either…

  2. Slightly driven by (1), but how should those interested collaborate? If
    we do it in the website repo, we could open a WIP PR and then allow
    multiple contributors on the PR (thanks for that recent feature GitHub!).
    Otherwise I guess an Etherpad or similar could work…

Thoughts?

--
Greg
IRC: gwmngilfen

Great, I'm up for assisting with whatever's needed - test cases, documentation
and so on. I think we should have a branch or repo in the main project.
It's easier to track things in one place, and then easier to push PRs if needed.

··· On Tuesday, 25 October 2016 13:07:48 UTC+1, Greg Sutcliffe wrote:

I agree. I'm all for collaborating and helping out with this. I also think
in addition to using the repo and PRs, we should maybe all try and jump on
a video call and talk out our different experiences and come together as to
what we believe should be the best practices. I think it might be good as
well to publish a white paper along with adding documentation to the manual.

··· On Tuesday, October 25, 2016 at 8:49:12 AM UTC-4, Martin Dobrev wrote:

That sounds like a good idea - I can host that if needed (but you could
probably figure out a hangout yourselves :P). Would you want that public
(recorded, deep-dive style), or "private" (in the sense that it's open to
the community to join, but not recorded)?

Greg

··· On 25 October 2016 at 13:53, Christopher Pisano wrote:

we should maybe all try and jump on a video call and talk out our
different experiences and come together as to what we believe should be
the best practices.

Having re-read
https://github.com/blog/2247-improving-collaboration-with-forks and
realised it's for maintainers only, I think you're probably right. Once
we've decided where it should live, that can probably be arranged.

Greg

··· On 25 October 2016 at 13:49, Martin Dobrev wrote:

Great, I’m up to assist with whatever needed - test cases, documentation
and so on. I think we should have a branch or repo in the main project.
Easier to track things in one place then easier to push PRs if needed.

>> we should maybe all try and jump on a video call and talk out our
>> different experiences and come together as to what we believe should be
>> the best practices.
>
> That sounds like a good idea - I can host that if needed (but you could
> probably figure out a hangout yourselves :P). Would you want that public
> (recorded, deep-dive style), or "private" (in the sense that it's open to
> the community to join, but not recorded)?

Initial talks can be private until we get some sort of draft of the work
required. Then we might record as well so the community knows what's going on.
Of course everyone's invited to participate and contribute to our talks.

··· On Tuesday, 25 October 2016 14:43:27 UTC+1, Greg Sutcliffe wrote:

Honestly, the opsec considerations in either scenario are the same. I would
suggest recording it in case there are questions later.

This is my first shot at multiple-Foreman documentation:
https://theforeman.org/manuals/1.13/index.html#5.8MultipleForemaninstances
It touches on a few things Chris P. mentioned in his blog post (reposted
on Foreman's site), and points out a few other issues that I ran into. It
doesn't address HA proxy stuff, PG pool for the DB (if needed), setting up
a real (not self-signed) cert, etc.

··· On Tuesday, October 25, 2016 at 10:21:30 AM UTC-4, Martin Dobrev wrote:

Red Hat have published a reference architecture for Satellite 6.2 which
covers a lot of areas - it is available at
https://access.redhat.com/sites/default/files/attachments/sat6ha-lb-refarch.pdf

··· On Tuesday, October 25, 2016 at 10:26:52 AM UTC-4, Chris Baldwin wrote:
> This is my first shot at multiple-foreman documentation:
> https://theforeman.org/manuals/1.13/index.html#5.8MultipleForemaninstances

Hi,

Let's schedule an initial meeting for next Monday. We can settle on a time
on IRC, but I'm eager to start moving this project forward, even slowly.

··· On Tuesday, 1 November 2016 01:49:04 UTC, Andrew Schofield wrote:
> RedHat have published a reference architecture for Satellite 6.2 which
> covers a lot of areas - it is available
> https://access.redhat.com/sites/default/files/attachments/sat6ha-lb-refarch.pdf

I may be a bit late to the party, but I'm interested in helping out with
this in any way I can.
I'm not running an HA setup atm, but I plan on going in that direction.
Maybe a fresh pair of eyes could be helpful?

So, last week mdobrev, discr33t, oogs, and timo (and me as secretary :P)
had a quick hangout to discuss a starting point for this. We took raw notes
at http://pad-katello.rhcloud.com/p/hadocs but here are my edited
highlights. Guys, correct me if I missed anything :wink:

Greg

··· On 11 November 2016 at 13:39, Martin Hovmöller wrote:

I may be a bit late to the party, but I’m interested in helping out with
this in any way I can.
I’m not running a HA setup atm, but I plan on going in that direction.
Maybe a fresh pair of eyes could be helpful?

Glad to have you :slight_smile:

HA Docs Kickoff - 11/10

On location of WIP docs:

  • Considerations were ease of collaboration, potential loss of edits and/or
    author history, and testing that it renders well.
  • Currently unsure whether this should eventually live in the Foreman manual
    (it may be quite large); it will probably be a separate doc
  • For now, a separate repo can hold the WIP allowing easy collaboration.

Other points

  • Need to detail the concepts & choices the user will need to decide on (e.g.
    SSL pass-through vs termination)
    • Important to detail the tradeoffs in each case
      • e.g. HA for scale is not the same as HA for availability
    • It’s ok to be opinionated (e.g. the installer already is)
  • Pick one set of answers to those questions for the first cut, then expand
    after
    • Get a picture/workflow of how the doc will look
    • The Sat6 doc is very good, for one specific set of answers to those
      questions
  • Focus on open-source HA tools
    • Users of bigger pieces of equipment probably know enough to work from
      there
  • Choice of initial platform matters a lot - e.g. HA Katello would be much
    more complex
  • Probably makes sense to begin from a clean installer run
  • Get a list of the services that need to be tackled - Foreman, proxy, DB,
    DHCP, DNS, even the cronjobs
    • Also things like Passenger tuning
  • Some initial prior art in the blogs by discr33t and mdobrev
  • Long-term plan to consider additional modules for the installer to handle
    this automatically

Actions:

[Greg] Spin up a basic Jekyll site using the main theforeman.org config for
initial hacking
[discr33t] Dig up notes on some of the pain points he hit previously
[oogs] Look into standalone SSL switchover docs (not directly connected but
useful)