PXE provisioning troubles? [Foreman 1.8.1]

Greetings fellow Foreman (and Katello) users,

I'm just joining this party, and am impressed with the community built up
around Foreman/Katello! Thanks for contributing.

I've been experimenting with Foreman 1.8 as a replacement for our current
provisioning method (PXE+kickstart+custom config scripts written in perl)
in my environment.

I've been having trouble fitting Foreman into the existing infrastructure
that we do not run. (DNS/DHCP and RHEL Satellite are managed by our central
campus IT.) Being pretty new to the system, I've yet to figure out whether
I should stick with Foreman or switch to Katello (2.2), but that may be a
different topic. I was wondering if I could get some insight into the PXE
provisioning problem I'm experiencing.

Here is what I see when I provision:

  1. Provision new test system 'testsystem' w/ a static IP -> provisions a
    new VM (VMware), which gets a new MAC and is added to Foreman Hosts. The
    PXELinux and provision (ks) templates are generated and viewable in the
    Foreman (web) host view.
  2. Boot new system 'testsystem' -> boots PXE w/ a dynamic-range IP
    (expected) assigned via DHCP and lands at the 'default' PXE menu with
    local and discovery options. There seems to be no handoff to the PXELinux
    template file seen in step 1 (unexpected). I looked in
    /var/lib/tftpboot/pxelinux.cfg and only the 'default' menu exists.
  3. Observation: The system does not appear to be handed off to the
    expected PXELinux menu.
      1. Is that because the IP address does not match the static definition?
      2. Normally, I would think it would just generate a MAC address based
        PXE menu file so it would not matter what the initial IP address is.
      3. I'm not finding any log entries in foreman/foreman-proxy (besides
        the tftp connection in /var/log/messages) for the address my test
        system is coming in as (the dynamic, temporary IP).

Any ideas where I might be futzing up?

Maybe I have some slightly incorrect assumptions how the PXE session is
handled?

Any suggestions?

Additional Environment deets:

  • DNS: Run by central IT, primarily have to submit requests for new
    static/A records.
  • DHCP: Run by central IT, but we have configs in place for
    'next-server' for our networks. PXE works.*
  • RHEL Satellite: Run by central IT, RHEL Satellite native
    kickstart+provisioning is disabled. (because they don't use it)
  • Foreman: Run by me, RHEL6 x86_64 server, Network/IPAM - Internal DB,
    Boot mode Static. Two subnets, VMware compute, one domain. DNS and DHCP
    proxies turned off. TFTP-proxy turned on. Testing system on the same
    network as foreman server. Ports 69[udp], 80/443[tcp], 8140[tcp], 8443[tcp]
    all open on the firewall.

* Each network has a dynamic range. On production it is just big enough to
spin up PXE+kickstart systems that switch to static IPs when handed off to
the kickstart file (anaconda/ks build phase). The development network is
split half and half between a dynamic range (with names) and static IPs
(without names). Most of our development uses the dynamic range, with static
names. Using non-dynamic ranges requires a formal request to add an A record
to DNS.

Hello,

> 2. Boot new system 'testsystem' -> boots PXE w/ Dynamic range
> IP(expected) assigned via DHCP and lands at the 'default' PXE menu with
> local and discovery options. There seems to be no handoff to the PXELinux
> template file seen in step 1(unexpected). I looked in
> /var/lib/tftpboot/pxelinux.cfg and only the 'default' menu exists.

When a new host is created and Build mode is checked, and you have assigned
a TFTP proxy properly for the given subnet, a menu should be generated for
that particular MAC address with the chainloading info needed to boot the
installer.
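
For anyone following along, here is a minimal sketch of why the per-MAC file matters. It assumes the documented PXELINUX lookup order (a file named after the ARP hardware type plus MAC, then the client's IPv4 address in uppercase hex shortened one digit at a time, then 'default') and leaves out the optional client-UUID lookup. The function name and the sample MAC/IP are made up for illustration.

```python
import ipaddress


def pxelinux_candidates(mac, ip):
    """Return the file names PXELINUX tries under pxelinux.cfg/, in order."""
    # "01" is the ARP hardware type for Ethernet; the MAC is lowercase
    # and dash-separated in the file name.
    mac_name = "01-" + mac.lower().replace(":", "-")
    # The IPv4 address as 8 uppercase hex digits, then progressively
    # shortened by one digit per attempt.
    hex_ip = format(int(ipaddress.IPv4Address(ip)), "08X")
    ip_names = [hex_ip[:n] for n in range(8, 0, -1)]
    return [mac_name] + ip_names + ["default"]


# Hypothetical client: a VMware-style MAC with a dynamic-range address.
for name in pxelinux_candidates("00:50:56:AB:CD:EF", "192.168.1.50"):
    print("pxelinux.cfg/" + name)
```

If only 'default' exists in /var/lib/tftpboot/pxelinux.cfg, every client falls through to the last entry in that list, which matches the behaviour described above.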

> 3. Observation: The system does not appear to be handed off to the
> expected PXELinux menu.
> 1. Is that because the IP address does not match the static definition?
> 2. Normally, I would think it would just generate a MAC address based
> PXE menu file so it would not matter what the initial IP address is.
> 3. I'm not finding any log entries in foreman/foreman-proxy (besides
> the tftp connection in /var/log/messages), for the address my test system
> is coming in as. (with the dynamic, temporary IP)

When a host goes into build mode, we generate a token (UUID) and put it
into the pxelinux.cfg configuration file. It is handed over during
provisioning (kickstart/preseed script). We identify hosts using this
token, so the IP does not matter (unless you turned tokens off).
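
As a rough illustration of the token handoff, the sketch below pulls the token out of a rendered PXELinux file. It assumes the kickstart URL is passed as a `ks=` kernel argument with a `token` query parameter; the sample config, host name and token value are invented, so compare against what your own rendered template actually contains.

```python
from urllib.parse import urlparse, parse_qs


def token_from_pxe_config(text):
    """Return the 'token' query parameter of the first ks= URL, or None."""
    for word in text.split():
        if word.startswith("ks="):
            query = urlparse(word[3:]).query
            tokens = parse_qs(query).get("token")
            return tokens[0] if tokens else None
    return None


# Hypothetical rendered file; the real one lives under pxelinux.cfg/.
SAMPLE = """\
DEFAULT linux
LABEL linux
  KERNEL boot/vmlinuz
  APPEND initrd=boot/initrd.img ks=http://foreman.example.com/unattended/provision?token=6f1c2d0e-1111-2222-3333-444455556666
"""

print(token_from_pxe_config(SAMPLE))
```

A missing token in the rendered file would suggest tokens are disabled, in which case the host would be matched another way and the dynamic IP could start to matter.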

Is your proxy communicating properly? Can you do Refresh features on the
Proxy list page? You should definitely see something in the proxy.log.
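
The same check can be scripted. This is only a sketch: it assumes the smart proxy answers `GET /features` with a JSON array of feature names, and that you supply whatever SSL context (client certificate, CA) your proxy requires. The helper names and the example URL are hypothetical.

```python
import json
import urllib.request


def has_feature(features_json, name):
    """True if the JSON array of feature names contains `name` (any case)."""
    return name.lower() in (f.lower() for f in json.loads(features_json))


def fetch_features(base_url, context=None):
    """Fetch the raw /features response body from a smart proxy."""
    url = base_url.rstrip("/") + "/features"
    with urllib.request.urlopen(url, timeout=10, context=context) as resp:
        return resp.read().decode("utf-8")


# Offline example with a made-up response body:
print(has_feature('["tftp", "puppet"]', "tftp"))

# Against a live proxy (not executed here; the URL is an assumption):
# print(has_feature(fetch_features("https://foreman.example.com:8443"), "tftp"))
```

If `tftp` is not in the list, Foreman has nowhere to write the per-MAC menu, which would explain seeing only 'default'.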

> Any ideas where I might be futzing up?

Make sure your subnet is correctly defined and linked with DHCP, TFTP
and DNS proxy.

> Any suggestions?

There are WIP sequence diagrams that could help you understand some more
bits:

https://github.com/theforeman/theforeman.org/pull/333/files

> Additional Environment deets:
>
> - DNS: Run by central IT, primarily have to submit requests for new
> static/A records.

If you don't link your subnet with a DNS proxy, no DNS orchestration takes
place. Everything else still works.

> - DHCP: Run by central IT, but we have configs in place for
> 'next-server' for our networks. PXE works.***

If tokens are enabled (the default setting for a couple of releases now), IP
addresses do not matter.

-- Later, Lukas #lzap Zapletal

Thank you Lukas, that diagram you referred me to was useful. It seems that
refreshing the foreman-proxy also helped, and turning on DEBUG provided a
lot more info than I was seeing. Our systems are now being built in an
unattended process, except for my RHEL systems.

It seems my one issue preventing RHEL builds now, is I'm missing my
$rhn_activation_key. I have the key, but unsure where to enter it, and the
'Provisioning Setup' is failing in the last stage (step4) mentioning this:

Oops, we're sorry but something went wrong
Warning!Validation failed: Group parameters.reference parameters require
an associated domain, operating system, host or host group

ActiveRecord::RecordInvalid

Validation failed: Group parameters.reference parameters require an
associated domain, operating system, host or host group
app/controllers/concerns/application_shared.rb:13:in `set_timezone'
app/models/concerns/foreman/thread_session.rb:32:in `clear_thread'
lib/middleware/catch_json_parse_errors.rb:9:in `call'

I think I have everything set up except for the RHN Satellite key, but I'm
unsure what to do with it while the provisioning setup script is unhappy.

I can open a bug report in the Foreman tracker if this appears to be a
larger issue than something I did on my lone system.

Suggestions?

On Tuesday, June 2, 2015 at 2:33:53 AM UTC-6, Lukas Zapletal wrote:
>
> when new host is created and Build mode is checked, if you assigned a
> TFTP proxy properly for given subnet, there should be menu generated for
> the particular MAC address with chainloading info how to boot the
> installer.
>
> When a host goes into build mode, we generate a token (UUID) and that
> one is put into the pxelinux.cfg configuration file. It is being handed
> over during provisioning (kickstart/preseed script). We identify hosts
> using this token, IP does not matter (only if you turned it off).
>
> Is your proxy communicating properly? Can you do Refresh features on the
> Proxy list page? You should definitely see something in the proxy.log.
>
> There is a WIP sequence diagrams that could help you understanding some
> more bits:
>
> https://github.com/theforeman/theforeman.org/pull/333/files

Hello,

> Validation failed: Group parameters.reference parameters require an
> associated domain, operating system, host or host group
> app/controllers/concerns/application_shared.rb:13:in `set_timezone'
> app/models/concerns/foreman/thread_session.rb:32:in `clear_thread'
> lib/middleware/catch_json_parse_errors.rb:9:in `call'

I have reproduced this, please file a bug for us.

In the meantime, select CentOS mirror in the provisioning setup and then
you can create your RHEL OS manually as well as Installation media and
other stuff.

-- Later, Lukas #lzap Zapletal

Thanks again Lukas.

I created a bug/ticket: Bug #10691: Foreman setup fails on step4_update with RHEL + Satellite - foreman_setup - Foreman

Something to note: even though the last step of the setup plugin failed, it
did create a RHEL6 OS and the corresponding RHEL Install Media. It seems the
key (pun intended) missing ingredient is the activation_key. As far as I can
tell, this is the last missing part, as my test build of a RHEL 6.6 host
was a success until it hit the rhn_register step in the kickstart (error in
the logs). I also thought of using puppet to do the registration, but
because the host failed to register, puppet's prerequisite software was
missing from the minimal install, and puppet would not install until the
host was registered with RHN/Satellite due to dependency issues. (Chicken
and egg problem, I guess.)

If there was a DB entry or alternative way to enter the key, I think this
would be a solid workaround. Or possibly I can just edit the template and
hardcode the activation_key in there. I'll probably proceed with that step
today unless you or anyone else in the community has a better suggestion.

Thanks again for the assistance.

After digging into the 'redhat_register' snippet, I realized I can just use
parameters. ;) I assigned them under the 'Operating Systems' parameters. I
created the following:

activation_key = <ACTIVATION_KEY>
spacewalk_host = <RHN SATELLITE FQDN>
spacewalk_type = site

Fingers crossed, this seems to be working right now. (running some
additional tests)
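
In case it helps anyone else, here is a tiny sanity check I would sketch for those parameters. It only knows about the three names listed above; treating all three as required is my assumption and not something taken from the snippet itself, and the function name is made up.

```python
# Parameter names as used in this post; "required" is an assumption.
REQUIRED = ("activation_key", "spacewalk_host", "spacewalk_type")


def missing_register_params(params):
    """Return the names from REQUIRED that are absent or blank in params."""
    return [name for name in REQUIRED
            if not str(params.get(name, "")).strip()]


# Placeholder values, as in the post above.
os_params = {
    "activation_key": "<ACTIVATION_KEY>",
    "spacewalk_host": "<RHN SATELLITE FQDN>",
    "spacewalk_type": "site",
}

print(missing_register_params(os_params))
```

An empty list means all three are set at some level (OS, host group, host or domain) before kicking off a build.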