Discovery image and slow network

Hello!

I'm using the discovery image:
"discovery_version": "3.0.5",
"discovery_release": "20151113.1",

I can't reboot a server running the discovery image through the Foreman web UI
(via auto-provisioning). Foreman says:
> ProxyAPI::ProxyException: ERF12-1772 [ProxyAPI::ProxyException]: Unable to perform power BMC operation ([Errno::ECONNREFUSED]: Connection refused - connect(2)) for proxy http://10.68.21.78:8448/bmc

I've enabled SSH on the discovery image.

When I SSH into it and check the smart-proxy status, I get:

[root@fdi ~]# systemctl status foreman-proxy
foreman-proxy.service - Foreman Proxy
Loaded: loaded (/usr/lib/systemd/system/foreman-proxy.service; enabled)
Active: failed (Result: exit-code) since Tue 2016-05-17 14:40:19 UTC; 10min ago
Process: 1473 ExecStartPre=/usr/bin/generate-proxy-cert (code=exited, status=1/FAILURE)

May 17 14:40:19 fdi generate-proxy-cert[1473]: Generating a 2048 bit RSA private key
May 17 14:40:19 fdi generate-proxy-cert[1473]: …+++
May 17 14:40:19 fdi generate-proxy-cert[1473]: …+++
May 17 14:40:19 fdi generate-proxy-cert[1473]: writing new private key to '/etc/foreman-proxy/key.pem'
May 17 14:40:19 fdi generate-proxy-cert[1473]: -----
May 17 14:40:19 fdi generate-proxy-cert[1473]: end of string encountered while processing type of subject name element #0
May 17 14:40:19 fdi generate-proxy-cert[1473]: problems making Certificate Request
May 17 14:40:19 fdi systemd[1]: foreman-proxy.service: control process exited, code=exited status=1
May 17 14:40:19 fdi systemd[1]: Failed to start Foreman Proxy.
May 17 14:40:19 fdi systemd[1]: Unit foreman-proxy.service entered failed state.

In /usr/bin/generate-proxy-cert I see:
IP=$(nmcli -t -f IP4.ADDRESS con show primary 2>/dev/null | cut -f2 -d: | cut -f1 -d/)

But the problem is that my network cards are kinda slow,
and generate-proxy-cert gets executed BEFORE my NIC gets its IP via DHCP.

How can I work around this problem?

The discovery-debug output is at http://pastebin.com/mVgPww0n

Hello,

> "discovery_version": "3.0.5",

> May 17 14:40:19 fdi generate-proxy-cert[1473]: end of string encountered
> while processing type of subject name element #0

Please update to the 3.1 image version, which is compatible with Foreman
1.10 and 1.11. This has been fixed already.

> but the problem is that my network cards are kinda slow
> and this generate-proxy-cert gets executed BEFORE my nic gets its IP via DHCP

Exactly.

> How can I work around this problem?

Update, the fix you need is this one:

https://github.com/theforeman/foreman-discovery-image/pull/50

The process of updating is simple:

http://theforeman.org/plugins/foreman_discovery/5.0/index.html#2.3.2Manualdownload

-- Later, Lukas #lzap Zapletal

Thanks, Lukas!

I've updated the discovery image, but still no luck.

[root@fdi ~]# facter productname

BladeCenter HS22 -[7870CTO]-

[root@fdi ~]# facter | egrep '(discovery_rel|discovery_ver)'

discovery_release => 20160428.1
discovery_version => 3.1.2

[root@fdi ~]# systemctl status foreman-proxy
● foreman-proxy.service - Foreman Proxy
Loaded: loaded (/usr/lib/systemd/system/foreman-proxy.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Thu 2016-05-19 09:30:39 UTC; 1min 52s ago
Process: 1508 ExecStartPre=/usr/bin/generate-proxy-cert (code=exited, status=1/FAILURE)

May 19 09:30:39 fdi generate-proxy-cert[1508]: …+++
May 19 09:30:39 fdi generate-proxy-cert[1508]: …+++
May 19 09:30:39 fdi generate-proxy-cert[1508]: writing new private key to '/etc/foreman-proxy/key.pem'
May 19 09:30:39 fdi generate-proxy-cert[1508]: -----
May 19 09:30:39 fdi generate-proxy-cert[1508]: end of string encountered while processing type of subject name element #0
May 19 09:30:39 fdi generate-proxy-cert[1508]: problems making Certificate Request
May 19 09:30:39 fdi systemd[1]: foreman-proxy.service: control process exited, code=exited status=1
May 19 09:30:39 fdi systemd[1]: Failed to start Foreman Proxy.
May 19 09:30:39 fdi systemd[1]: Unit foreman-proxy.service entered failed state.
May 19 09:30:39 fdi systemd[1]: foreman-proxy.service failed.

full discovery-debug is in http://pastebin.com/uQNrxAUq

[root@fdi ~]# facter productname

System x3550 M2 -[7946w9w]-

[root@fdi ~]# facter | egrep '(discovery_rel|discovery_ver)'
discovery_release => 20160428.1
discovery_version => 3.1.2

[root@fdi ~]# systemctl status foreman-proxy
● foreman-proxy.service - Foreman Proxy
Loaded: loaded (/usr/lib/systemd/system/foreman-proxy.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Thu 2016-05-19 09:28:16 UTC; 18min ago
Process: 1389 ExecStartPre=/usr/bin/generate-proxy-cert (code=exited, status=1/FAILURE)

May 19 09:28:16 fdi generate-proxy-cert[1389]: …+++
May 19 09:28:16 fdi generate-proxy-cert[1389]: …+++
May 19 09:28:16 fdi generate-proxy-cert[1389]: writing new private key to '/etc/foreman-proxy/key.pem'
May 19 09:28:16 fdi generate-proxy-cert[1389]: -----
May 19 09:28:16 fdi generate-proxy-cert[1389]: end of string encountered while processing type of subject name element #0
May 19 09:28:16 fdi generate-proxy-cert[1389]: problems making Certificate Request
May 19 09:28:16 fdi systemd[1]: foreman-proxy.service: control process exited, code=exited status=1
May 19 09:28:16 fdi systemd[1]: Failed to start Foreman Proxy.
May 19 09:28:16 fdi systemd[1]: Unit foreman-proxy.service entered failed state.
May 19 09:28:16 fdi systemd[1]: foreman-proxy.service failed.

full discovery log is in http://pastebin.com/35VRwTgh

Can't we use some systemd magic and restart foreman-proxy always, or
on-failure, or something?
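
On the systemd question, here is a minimal sketch of a drop-in that asks systemd to retry the unit after a failed start. According to the systemd.service documentation, Restart= also covers failures of ExecStartPre= commands, so the failing generate-proxy-cert step should be retried as well. The helper name write_restart_dropin, the file name restart.conf and the 15-second delay are assumptions for illustration and not part of the discovery image. Changes made on the running live image would presumably not survive a reboot.

```shell
# Hypothetical helper: write a drop-in that makes systemd retry
# foreman-proxy after a failed start (including a failed ExecStartPre).
# $1 = systemd unit directory (default: /etc/systemd/system).
write_restart_dropin() {
  local dir="${1:-/etc/systemd/system}/foreman-proxy.service.d"
  mkdir -p "$dir" || return 1
  # RestartSec=15 is an arbitrary example value, chosen to stay well
  # clear of systemd's start rate limiting.
  cat > "$dir/restart.conf" <<'EOF'
[Service]
Restart=on-failure
RestartSec=15
EOF
}

# Usage on the image, as root:
#   write_restart_dropin && systemctl daemon-reload && systemctl restart foreman-proxy
```

With this in place, systemd would keep retrying every 15 seconds until the certificate script succeeds, which only helps if the interface eventually gets its address.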


Hey,

can you run the script manually:

> Process: 1508 ExecStartPre=/usr/bin/generate-proxy-cert (code=exited,
> status=1/FAILURE)

Does your network provide an IPv4 address at all?

We need to have a valid (self-signed) HTTPS certificate for the particular
primary interface. If there is no address, we can't do much about it. We might
change the code and start the proxy anyway so it is accessible over HTTP;
that looks like an improvement, but it won't help you much unless you
configure Foreman to use HTTP.

http://projects.theforeman.org/issues/15138
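
Since the certificate step fails when the address lookup comes back empty, one possible manual approach is to wait for the address before retrying the service. The sketch below reuses the nmcli pipeline quoted from /usr/bin/generate-proxy-cert earlier in the thread. The function names, the 120-second default timeout and the 2-second poll interval are assumptions for illustration, and this is not how the image itself handles it.

```shell
# Hypothetical helper, not part of the discovery image.

# Print the IPv4 address of the NetworkManager connection "primary"
# (same pipeline as in /usr/bin/generate-proxy-cert), or nothing if unset.
get_primary_ip() {
  nmcli -t -f IP4.ADDRESS con show primary 2>/dev/null | cut -f2 -d: | cut -f1 -d/
}

# Poll until get_primary_ip prints an address or the timeout expires.
# $1 = timeout in seconds (default 120), $2 = poll interval (default 2).
# Prints the address and returns 0 on success, returns 1 on timeout.
wait_for_primary_ip() {
  local timeout=${1:-120} interval=${2:-2} waited=0 ip
  while :; do
    ip=$(get_primary_ip)
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    [ "$waited" -ge "$timeout" ] && return 1
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Usage on the image, as root:
#   wait_for_primary_ip 180 && systemctl restart foreman-proxy
```

This only helps when the connection does come up eventually. If NetworkManager has already given up on DHCP, the loop simply times out.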

> full discovery-debug is in http://pastebin.com/uQNrxAUq

Something is wrong with your network:

DHCPDISCOVER on eno1 to 255.255.255.255 port 67 interval 18 (xid=0x116de311)
<warn> (eno1): DHCPv4 request timed out.
<info> (eno1): DHCPv4 state changed unknown -> timeout
<info> (eno1): canceled DHCP transaction, DHCP client pid 1489
<info> (eno1): DHCPv4 state changed timeout -> done
<info> (eno1): device state change: ip-config -> failed (reason 'ip-config-unavailable') [70 120 5]
<info> NetworkManager state is now DISCONNECTED
<info> startup complete
<warn> (eno1): Activation: failed for connection 'primary'
<info> (eno1): device state change: failed -> disconnected (reason 'none') [120 30 0]

-- Later, Lukas #lzap Zapletal

Hello!

> can you run the script manually:
>
> > Process: 1508 ExecStartPre=/usr/bin/generate-proxy-cert (code=exited,
> > status=1/FAILURE)
>

Yep. After the blue discovery screen appears, I can SSH into the host and
run that script without any problems.

> Does your network provide IPv4 address at all?
>

Yes, it does. In fact, I've got problems only with those IBM HS22
blade servers.
The very same foreman setup works just fine with

> full discovery-debug is in http://pastebin.com/uQNrxAUq
>
> Something is wrong with your network:

It's not the network but the hardware.
I've tried
rd.net.timeout.dhcp=120 rd.net.dhcp.retry=10 rd.net.timeout.carrier=120 rd.net.timeout.iflink=120

but they all seem to be ignored, even though
https://github.com/dracutdevs/dracut/blob/RHEL-7/modules.d/40network/ifup.sh#L102
suggests they should work.

The latest discovery-debug is at http://pastebin.com/WY5ZKqHa

> rd.net.timeout.dhcp=120 rd.net.dhcp.retry=10 rd.net.timeout.carrier=120
> rd.net.timeout.iflink=120

Well, if you booted into discovery, Dracut is done with its work, so you
are trying to modify parameters of something that works just fine.

The problem here is the DHCP client spawned by NetworkManager. It defaults
to a 45-second DHCP timeout, then it enters the failed state and all
services depending on the network continue booting.

Unfortunately, there is an RFE for RHEL 7.3 (in QA state) to backport
the dhcp-timeout option. I will add this option to the next release of the
image based on CentOS 7.3, but until then, I see no workaround:

https://bugzilla.redhat.com/show_bug.cgi?id=1262922
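
For readers on a system whose NetworkManager already has the option, here is a rough sketch of raising the timeout for the "primary" connection. It assumes the option referred to above is NetworkManager's ipv4.dhcp-timeout connection property, which per this thread is not available in the 3.1.2 image. The helper name set_dhcp_timeout and the 120-second value are illustrative.

```shell
# Hypothetical helper: raise the DHCPv4 timeout of a NetworkManager
# connection and re-activate it.
# Assumes a NetworkManager version that supports ipv4.dhcp-timeout.
# $1 = connection name, $2 = timeout in seconds.
set_dhcp_timeout() {
  local con=$1 secs=$2
  case $secs in
    ''|*[!0-9]*)
      echo "timeout must be a whole number of seconds" >&2
      return 2
      ;;
  esac
  nmcli connection modify "$con" ipv4.dhcp-timeout "$secs" &&
    nmcli connection up "$con"
}

# Usage, as root:
#   set_dhcp_timeout primary 120
```

On an image without the backport, the modify call would be expected to fail with an unknown-property error, so this is no workaround for the current release.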

-- Later, Lukas #lzap Zapletal

> Unfortunately, there is a RFE for RHEL 7.3 (in QA state) to backport
> dhcp-timeout option. I will add this option to the next release of image
> based on CentOS 7.3, but until then, I see no workaround:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1262922

Unfortunately, I'm not a Red Hat employee and can't read that link.
But I got the idea: I have to wait 'til 7.3 is out.

Thanks for your time.

Cheers!