Openstack/Foreman still tries to connect by SSH while launching a new instance without floating ip

Hi,

We just got our OpenStack Havana cluster up and running. In Foreman 1.4 you
should be able to apply configuration via cloud-init (user-data).
This all works, but when you launch a machine Foreman keeps waiting at
'querying instance details for <hostname>'. This only happens when no
floating IP is attached. Everything works fine when you add a floating IP address.

Another issue is that the launched instance (with a floating IP) stays in
build mode. I don't know how to finish this. Just hitting cancel works, but
that isn't very elegant.

Anybody any ideas?

Here is my cloud-init template

#!/bin/bash
#kind: finish
#name: Ubuntu ec2 Finish

#oses:
#- Debian 6.0
#- Debian 7.0
#- Ubuntu 10.04
#- Ubuntu 12.04
#- Ubuntu 13.04

wget http://apt.puppetlabs.com/puppetlabs-release-stable.deb
dpkg -i puppetlabs-release-stable.deb
apt-get --yes --quiet update
apt-get --yes -o Dpkg::Options::="--force-confold" --quiet install facter puppet-common puppet libaugeas-ruby

mkdir -p /etc/puppet

cat > /etc/puppet/puppet.conf << EOF
#kind: snippet
#name: puppet.conf
[main]
vardir = /var/lib/puppet
logdir = /var/log/puppet
rundir = /var/run/puppet
ssldir = \$vardir/ssl

[agent]
pluginsync = true
report = true
ignoreschedules = true
daemon = false
ca_server = foreman.naturalis.nl
certname = hh.openstacklocal
environment = production
server = foreman.naturalis.nl

EOF

rm /etc/hosts
cat > /etc/hosts << EOF
127.0.0.1 hh.openstacklocal hh localhost

# The following lines are desirable for IPv6 capable hosts

::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

EOF

/bin/cat /etc/hosts
/bin/hostname -f

/bin/sed -i 's/^START=no/START=yes/' /etc/default/puppet

/bin/touch /etc/puppet/namespaceauth.conf
/usr/bin/puppet agent --enable
sleep 60
/usr/bin/puppet agent --config /etc/puppet/puppet.conf -t --onetime --tags no_such_tag --server foreman.naturalis.nl --no-daemonize
# Imported this line from bare-metal provisioning, but it doesn't seem to work.
/usr/bin/wget --quiet --output-document=/dev/null --no-check-certificate http://foreman.naturalis.nl:80/unattended/built

# /sbin/reboot

Are you on 1.4.0 or 1.4.1? The patch that removes the requirement to be able
to reach the IP of the host only went into 1.4.1.

Greg


Hi Greg,

I am on 1.4.1 (we upgraded from 1.3)


The build state is updated by the wget at the end of the template, so
it sounds like that isn't being run (probably due to the IP issue, so
let's solve that first). Your template looks fine. Did you mark the
image in Foreman as being user-data capable (it's a checkbox on the
Image edit/create screen)?

Greg
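
To make it easier to see whether that final wget actually ran and succeeded, the call could be wrapped so that a failure shows up in the cloud-init output. This is only a sketch and not part of the posted template: notify_built and the FETCH override are hypothetical names introduced here, and the URL is whatever the rendered template supplies. It relies on wget exiting non-zero when the server answers with an HTTP error.

```shell
#!/bin/bash
# Hypothetical helper (not from the posted template): call Foreman's
# "built" URL and report whether the request succeeded.
# FETCH can be overridden, e.g. to exercise the function without a network.
notify_built() {
    local url="$1"
    local fetch="${FETCH:-wget --quiet --output-document=/dev/null --no-check-certificate}"
    # wget returns a non-zero exit status on an HTTP error such as a 500.
    if $fetch "$url"; then
        echo "built notification sent to $url"
    else
        echo "built notification FAILED for $url" >&2
        return 1
    fi
}
```

In the template this would replace the bare wget line, for example notify_built "http://foreman.naturalis.nl:80/unattended/built", so a 500 from Foreman is logged instead of being silently discarded.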


Hi Greg,

Yes, the user-data checkbox is on, and the cloud-init script is applied.
There is something I forgot to mention in my earlier mail: in the situation
without a floating IP, Foreman tries to SSH in to the new instance at its
local address. This causes the timeout.

Atze


The unattended url returns a 500:
This is during the cloud-init:
--2014-03-12 12:16:44--  http://foreman.naturalis.nl/unattended/provision?token=e1325740-021f-4780-b6fe-0b6a4c3ff3fa
Resolving foreman.naturalis.nl (foreman.naturalis.nl)... 134.213.30.6
Connecting to foreman.naturalis.nl (foreman.naturalis.nl)|134.213.30.6|:80... connected.
HTTP request sent, awaiting response... 500 Internal Server Error
2014-03-12 12:16:46 ERROR 500: Internal Server Error

This is the logging from production.log:
Completed 201 Created in 2921.1ms (Views: 2.9ms | ActiveRecord: 0.0ms)
Started GET "/unattended/provision?token=e1325740-021f-4780-b6fe-0b6a4c3ff3fa" for 132.229.34.105 at Wed Mar 12 12:16:45 +0000 2014
Processing by UnattendedController#provision as */*
Parameters: {"token"=>"e1325740-021f-4780-b6fe-0b6a4c3ff3fa"}
Found bbb.openstacklocal
Remove puppet certificate for bbb.openstacklocal
Started GET "/hosts/bbb.openstacklocal/console" for 132.229.34.99 at Wed Mar 12 12:16:45 +0000 2014
Processing by HostsController#console as HTML
Parameters: {"id"=>"bbb.openstacklocal"}
Adding autosign entry for bbb.openstacklocal
Rendered hosts/console/log.html.erb within layouts/application (3.2ms)
Rendered home/_user_dropdown.html.erb (1.8ms)
Read fragment views/tabs_and_title_records-1 0.2ms
Rendered home/_topbar.html.erb (3.0ms)
Rendered layouts/base.html.erb (4.7ms)
Completed 200 OK in 958.7ms (Views: 10.2ms | ActiveRecord: 0.8ms)
Operation FAILED: undefined method `path' for nil:NilClass
Completed 500 Internal Server Error in 1146.8ms


The production.log is a bit cluttered with requests to the console. Sorry
about that.


It will try to SSH, yes, but it shouldn't raise any errors when using
user-data. It's only testing to see if any of the IPs reported by
Openstack are reachable, to choose which one to save in the Foreman
DB. If none are reachable, it'll pick the first one and carry on.
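
The selection behaviour described above (probe each reported address, keep the first one that answers, otherwise fall back to the first address) can be sketched in a few lines of bash. This is not Foreman's code, which is Ruby; pick_address, probe_ssh and the PROBE override are hypothetical names, and the 2 second timeout on port 22 is an assumption chosen for illustration.

```shell
#!/bin/bash
# Sketch of "first reachable address wins, otherwise use the first one".
# Not Foreman's implementation; all names here are illustrative.

# Default probe: succeed if a TCP connection to port 22 opens within 2s.
probe_ssh() {
    timeout 2 bash -c "exec 3<>/dev/tcp/$1/22" 2>/dev/null
}

# pick_address IP [IP...]
# PROBE may name another command or function that returns 0 when the
# address passed as its first argument is reachable.
pick_address() {
    local probe="${PROBE:-probe_ssh}"
    local ip
    for ip in "$@"; do
        if "$probe" "$ip"; then
            echo "$ip"
            return 0
        fi
    done
    # Nothing answered: keep the first address instead of failing.
    echo "$1"
}
```

For an instance with only a private address, pick_address 192.168.1.3 would print 192.168.1.3 after the probe gives up, which is the "pick the first one and carry on" case.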

As for the Internal Error on the wget, you'll need to enable debug
logging and send us the debug log of the /unattended/built request.
See http://projects.theforeman.org/projects/foreman/wiki/Troubleshooting#How-do-I-enable-debugging

Greg


Hi Greg,

About the wget error:
On the instance, this link is used:

http://foreman.naturalis.nl/unattended/provision?token=14646a4b-a8ad-4697-a8ae-aab3c0eb7900

In the rendered template (via the Foreman interface), this link is provided instead:

http://foreman.naturalis.nl:80/unattended/built?token=14646a4b-a8ad-4697-a8ae-aab3c0eb7900

It seems to me that the correct URL is not sent to the instance.
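
One way to confirm which URL the instance really received is to search the user-data script stored on the instance. The following is a sketch under assumptions: list_unattended_urls and has_built_url are hypothetical helper names, and the file path is passed in because the location of the stored script depends on the image (on many cloud-init images it is /var/lib/cloud/instance/user-data.txt, but check your own).

```shell
#!/bin/bash
# Hypothetical helpers for inspecting a rendered user-data script.

# Print every distinct /unattended/<action> URL found in the file,
# without the query string (so tokens are not echoed).
list_unattended_urls() {
    grep -oE 'https?://[^" ]+/unattended/[a-z]+' "$1" | sort -u
}

# Succeed only if the script calls the "built" endpoint somewhere.
has_built_url() {
    grep -q '/unattended/built' "$1"
}
```

If has_built_url fails on the instance while the template preview in Foreman shows a built URL, the difference comes from how the template is rendered for user-data rather than from the template text itself.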


Ah, are you using foreman_url with no options in your template? That's
a known bug with the user_data templates, try using
foreman_url('built') instead.

Greg

Ah cool, gonna try that.

About the SSH and timeout: this happens when I launch an instance without a
floating IP. Here is a tcpdump run on the Foreman server, filtered on the
local IP of the instance. It does this every 2 seconds for about 90 seconds,
and then the Foreman interface returns a timeout.
13:14:15.410814 IP foreman.naturalis.nl.35248 > 192.168.1.3.ssh: Flags [S], seq 542766296, win 14600, options [mss 1460,sackOK,TS val 106475403 ecr 0,nop,wscale 6], length 0
13:14:16.409351 IP foreman.naturalis.nl.35248 > 192.168.1.3.ssh: Flags [S], seq 542766296, win 14600, options [mss 1460,sackOK,TS val 106475653 ecr 0,nop,wscale 6], length 0
13:14:18.415345 IP foreman.naturalis.nl.35252 > 192.168.1.3.ssh: Flags [S], seq 1068734316, win 14600, options [mss 1460,sackOK,TS val 106476154 ecr 0,nop,wscale 6], length 0
13:14:19.412238 IP foreman.naturalis.nl.35252 > 192.168.1.3.ssh: Flags [S], seq 1068734316, win 14600, options [mss 1460,sackOK,TS val 106476404 ecr 0,nop,wscale 6], length 0

This is the full trace in the Foreman interface:
Timeout::Error
execution expired
app/models/concerns/orchestration/compute.rb:194:in `find_address'
app/models/concerns/orchestration/compute.rb:99:in `setComputeDetails'
app/models/concerns/orchestration/compute.rb:95:in `each'
app/models/concerns/orchestration/compute.rb:95:in `setComputeDetails'
app/models/concerns/orchestration.rb:148:in `send'
app/models/concerns/orchestration.rb:148:in `execute'
app/models/concerns/orchestration.rb:88:in `process'
app/models/concerns/orchestration.rb:80:in `each'
app/models/concerns/orchestration.rb:80:in `process'
app/models/concerns/orchestration.rb:18:in `on_save'
app/models/concerns/foreman/sti.rb:29:in `save'
app/controllers/hosts_controller.rb:89:in `create'
app/models/concerns/foreman/thread_session.rb:33:in `clear_thread'
lib/middleware/catch_json_parse_errors.rb:9:in `call'
