Problem:
Foreman trigger network interfaces update on puppet facts upload from oVirt 4.3 hypervisors (CentOS 7.7, puppet 5). oVirt uses linux bridges on top of vlan interfaces. It make foreman ruby processes eat all memory on our foreman server. I see a lot of audit logs about these changes (About 10 000 per hour) like following:
2020-03-10T15:26:19 [W|app|324427ce] Not queueing Nic::Managed: ["Identifier has already been taken"]
2020-03-10T15:26:19 [W|app|324427ce] Not queueing Nic::Managed: ["Identifier has already been taken"]
2020-03-10T15:26:20 [I|aud|324427ce] Nic::Managed (11273) update event on attached_to unknown, BBB
2020-03-10T15:26:21 [I|aud|324427ce] Nic::Managed (11282) update event on identifier unknown.36, BBB.36
2020-03-10T15:26:21 [I|aud|324427ce] Nic::Managed (11283) update event on attached_to unknown, BBB
2020-03-10T15:26:21 [I|aud|324427ce] Nic::Managed (11283) update event on identifier unknown.37, BBB.37
2020-03-10T15:26:21 [I|aud|324427ce] Nic::Managed (11284) update event on attached_to unknown, BBB
2020-03-10T15:26:21 [I|aud|324427ce] Nic::Managed (11284) update event on identifier unknown.38, BBB.38
2020-03-10T15:26:21 [I|aud|324427ce] Nic::Managed (11285) update event on attached_to unknown, BBB
2020-03-10T15:26:21 [I|aud|324427ce] Nic::Managed (11285) update event on identifier unknown.39, BBB.39
2020-03-10T15:26:22 [I|aud|324427ce] Nic::Managed (11286) update event on attached_to unknown, BBB
2020-03-10T15:26:22 [I|aud|324427ce] Nic::Managed (11286) update event on identifier unknown.41, BBB.41
2020-03-10T15:26:22 [I|aud|324427ce] Nic::Managed (11287) update event on attached_to unknown, BBB
2020-03-10T15:26:22 [I|aud|324427ce] Nic::Managed (11287) update event on identifier unknown.42, BBB.42
2020-03-10T15:26:22 [I|aud|324427ce] Nic::Managed (11288) update event on attached_to unknown, BBB
2020-03-10T15:26:22 [I|aud|324427ce] Nic::Managed (11288) update event on identifier unknown.48, BBB.48
2020-03-10T15:26:22 [I|aud|324427ce] Nic::Managed (11289) update event on attached_to unknown, BBB
2020-03-10T15:26:22 [I|aud|324427ce] Nic::Managed (11289) update event on identifier unknown.49, BBB.49
2020-03-10T15:26:22 [I|aud|324427ce] Nic::Managed (11290) update event on attached_to unknown, BBB
2020-03-10T15:26:22 [I|aud|324427ce] Nic::Managed (11290) update event on identifier unknown.50, BBB.50
2020-03-10T15:26:22 [I|aud|324427ce] Nic::Managed (11291) update event on attached_to unknown, BBB
2020-03-10T15:26:22 [I|aud|324427ce] Nic::Managed (11291) update event on identifier unknown.52, BBB.52
2020-03-10T15:26:22 [I|aud|324427ce] Nic::Managed (11292) update event on attached_to unknown, BBB
2020-03-10T15:26:22 [I|aud|324427ce] Nic::Managed (11292) update event on identifier unknown.53, BBB.53
2020-03-10T15:27:01 [W|app|324427ce] Not queueing Nic::Managed: ["IP address can't be blank"]
2020-03-10T15:27:01 [W|app|324427ce] Not queueing Nic::Managed: ["IP address can't be blank"]
2020-03-10T15:27:01 [W|app|324427ce] Not queueing Nic::Managed: ["IP address can't be blank"]
2020-03-10T15:27:01 [W|app|324427ce] Not queueing Nic::Managed: ["Identifier has already been taken"]
2020-03-10T15:27:01 [W|app|324427ce] Not queueing Nic::Managed: ["Identifier has already been taken"]
2020-03-10T15:27:01 [W|app|324427ce] Not queueing Nic::Managed: ["Identifier has already been taken"]
2020-03-10T15:27:02 [I|aud|324427ce] Nic::Managed (11301) update event on attached_to p1p1, JJJ
2020-03-10T15:27:02 [I|aud|324427ce] Nic::Managed (11301) update event on identifier p1p1.103, JJJ.103
2020-03-10T15:27:02 [I|aud|324427ce] Nic::Managed (11302) update event on attached_to p1p1, JJJ
2020-03-10T15:27:02 [I|aud|324427ce] Nic::Managed (11302) update event on identifier p1p1.111, JJJ.111
2020-03-10T15:27:02 [I|aud|324427ce] Nic::Managed (11303) update event on attached_to p1p1, JJJ
2020-03-10T15:27:05 [I|aud|324427ce] Nic::Managed (11324) update event on attached_to p1p2, unknown
2020-03-10T15:27:05 [I|aud|324427ce] Nic::Managed (11324) update event on identifier p1p2.31, unknown.31
2020-03-10T15:27:06 [I|aud|324427ce] Nic::Managed (11325) update event on attached_to p1p2, unknown
2020-03-10T15:27:06 [I|aud|324427ce] Nic::Managed (11325) update event on identifier p1p2.35, unknown.35
I found this unmerged PR from @lzap for the audit log lines but I think this is a symptom and not the root cause, foreman should not trigger network interfaces update at all because nothing changed between each puppet run (these updates happeneach time puppet post facts to foreman server). The server was provisioned via foreman.
Expected outcome:
Foreman does not trigger update for these network interfaces, and so, a lot less audit log.
Foreman and Proxy versions:
Foreman 1.24.2 / Katello 3.14.1
Foreman and Proxy plugin versions:
Distribution and version:
CentOS 7.7
Other relevant data:
Facter output look like this:
aio_agent_version => 5.5.18
augeas => {
version => "1.12.0"
}
[...]
facterversion => 3.11.11
[...]
networking => {
domain => "foo.bar.example.com",
fqdn => "hv1.foo.bar.example.com",
hostname => "hv1",
interfaces => {
;vdsmdummy; => {
mac => "ce:81:e2:7c:bf:3f",
mtu => 1500
},
BBB => {
mac => "01:1b:21:74:90:19",
mtu => 1496
},
GGG => {
mac => "01:1b:21:74:90:19",
mtu => 1496
},
App => {
mac => "01:1b:21:74:90:19",
mtu => 1496
},
LLL => {
mac => "01:1b:21:74:90:18",
mtu => 1496
},
WWW => {
mac => "01:1b:21:74:90:18",
mtu => 1496
},
br-int => {
mac => "3e:a1:3b:ad:6b:49",
mtu => 1500
},
em1 => {
mac => "62:2b:2b:57:f2:65",
mtu => 1496
},
em2 => {
mac => "62:2b:2b:57:f2:66",
mtu => 1500
},
lo => {
bindings => [
{
address => "127.0.0.1",
netmask => "255.0.0.0",
network => "127.0.0.0"
}
],
bindings6 => [
{
address => "::1",
netmask => "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff",
network => "::1"
}
],
ip => "127.0.0.1",
ip6 => "::1",
mtu => 65536,
netmask => "255.0.0.0",
netmask6 => "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff",
network => "127.0.0.0",
network6 => "::1",
scope6 => "host"
},
on33477c7eccb64 => {
mac => "01:1b:21:74:90:19",
mtu => 1496
},
on7e3ecefc631f4 => {
mac => "01:1b:21:74:90:19",
mtu => 1496
},
on9d34497362a84 => {
mac => "01:1b:21:74:90:19",
mtu => 1496
},
ond438dea2accc4 => {
mac => "01:1b:21:74:90:19",
mtu => 1496
},
ovirtmgmt => {
bindings => [
{
address => "192.168.1.100",
netmask => "255.255.255.0",
network => "192.168.1.0"
}
],
bindings6 => [
{
address => "fe80::862b:2bff:fe57:f265",
netmask => "ffff:ffff:ffff:ffff::",
network => "fe80::"
}
],
ip => "192.168.1.100",
ip6 => "fe80::862b:2bff:fe57:f265",
mac => "84:2b:2b:57:f2:65",
mtu => 1496,
netmask => "255.255.255.0",
netmask6 => "ffff:ffff:ffff:ffff::",
network => "192.168.1.0",
network6 => "fe80::",
scope6 => "link"
},
ovs-system => {
mac => "96:02:bb:36:6f:31",
mtu => 1500
},
p1p1 => {
mac => "01:1b:21:74:90:18",
mtu => 1496
},
p1p1.103 => {
mac => "01:1b:21:74:90:18",
mtu => 1496
},
p1p1.111 => {
mac => "01:1b:21:74:90:18",
mtu => 1496
},
p1p1.130 => {
mac => "00:1b:21:74:90:18",
mtu => 1496
},
p1p2 => {
mac => "01:1b:21:74:90:19",
mtu => 1496
},
p1p2.10 => {
mac => "01:1b:21:74:90:19",
mtu => 1496
},
p1p2.11 => {
mac => "01:1b:21:74:90:19",
mtu => 1496
},
p1p2.12 => {
mac => "01:1b:21:74:90:19",
mtu => 1496
},
p1p2.41 => {
mac => "00:1b:21:74:90:19",
mtu => 1496
},
p1p2.42 => {
mac => "00:1b:21:74:90:19",
mtu => 1496
},
vnet0 => {
bindings6 => [
{
address => "fe80::fc54:ff:fe4b:d1f4",
netmask => "ffff:ffff:ffff:ffff::",
network => "fe80::"
}
],
ip6 => "fe80::fc54:ff:fe4b:d1f4",
mac => "fe:54:00:4b:d1:f4",
mtu => 1496,
netmask6 => "ffff:ffff:ffff:ffff::",
network6 => "fe80::",
scope6 => "link"
},
vnet1 => {
bindings6 => [
{
address => "fe80::fc54:ff:fe77:2cb0",
netmask => "ffff:ffff:ffff:ffff::",
network => "fe80::"
}
],
ip6 => "fe80::fc54:ff:fe77:2cb0",
mac => "fe:54:00:77:2c:b0",
mtu => 1496,
netmask6 => "ffff:ffff:ffff:ffff::",
network6 => "fe80::",
scope6 => "link"
},
vnet2 => {
bindings6 => [
{
address => "fe80::fc54:ff:fe66:9856",
netmask => "ffff:ffff:ffff:ffff::",
network => "fe80::"
}
],
ip6 => "fe80::fc54:ff:fe66:9856",
mac => "fe:54:00:66:98:56",
mtu => 1496,
netmask6 => "ffff:ffff:ffff:ffff::",
network6 => "fe80::",
scope6 => "link"
},
vnet21 => {
bindings6 => [
{
address => "fe80::fc54:ff:fe79:d476",
netmask => "ffff:ffff:ffff:ffff::",
network => "fe80::"
}
],
ip6 => "fe80::fc54:ff:fe79:d476",
mac => "fe:54:00:79:d4:76",
mtu => 1496,
netmask6 => "ffff:ffff:ffff:ffff::",
network6 => "fe80::",
scope6 => "link"
},
vnet22 => {
bindings6 => [
{
address => "fe80::fc54:ff:feb6:57",
netmask => "ffff:ffff:ffff:ffff::",
network => "fe80::"
}
],
ip6 => "fe80::fc54:ff:feb6:57",
mac => "fe:54:00:b6:00:57",
mtu => 1496,
netmask6 => "ffff:ffff:ffff:ffff::",
network6 => "fe80::",
scope6 => "link"
},
vnet3 => {
bindings6 => [
{
address => "fe80::fc54:ff:fe6f:ea1d",
netmask => "ffff:ffff:ffff:ffff::",
network => "fe80::"
}
],
ip6 => "fe80::fc54:ff:fe6f:ea1d",
mac => "fe:54:00:6f:ea:1d",
mtu => 1496,
netmask6 => "ffff:ffff:ffff:ffff::",
network6 => "fe80::",
scope6 => "link"
},
[...]
}
os => {
architecture => "x86_64",
distro => {
codename => "Core",
description => "CentOS Linux release 7.7.1908 (Core)",
id => "CentOS",
release => {
full => "7.7.1908",
major => "7",
minor => "7"
},
specification => ":core-4.1-amd64:core-4.1-noarch"
},
family => "RedHat",
hardware => "x86_64",
name => "CentOS",
release => {
full => "7.7.1908",
major => "7",
minor => "7"
},
selinux => {
config_mode => "enforcing",
config_policy => "targeted",
current_mode => "enforcing",
enabled => true,
enforced => true,
policy_version => "31"
}
}
The call stack when interface update is (I added ‘raise Exception.new if self.auditable_type == “Nic::Managed”’ in app/models/concerns/audit_extensions.rb:164 to see it):
2020-03-12T08:35:44 [I|app|61664604] Processing by Api::V2::HostsController#facts as JSON
2020-03-12T08:35:44 [I|app|61664604] Parameters: {"facts"=>"[FILTERED]", "name"=>"hv1.foo.bar.example.com", "certname"=>"hv1.foo.bar.example.com", "apiv"=>"v2", "host"=>{"certname"=>"hv1.foo.bar.example.com", "name"=>"hv1.foo.bar.example.com"}}
2020-03-12T08:35:44 [W|app|61664604] Some subtrees exceeded 100 limit of facts, dropped 385 keys
2020-03-12T08:35:45 [I|app|a6f9a442] Rendering text template
2020-03-12T08:35:45 [I|app|a6f9a442] Rendered text template (0.0ms)
2020-03-12T08:35:45 [I|app|a6f9a442] Completed 200 OK in 388ms (Views: 0.7ms | ActiveRecord: 47.7ms)
2020-03-12T08:35:46 [I|app|61664604] Import facts for 'hv1.foo.bar.example.com' completed. Added: 0, Updated: 19, Deleted 0 facts
2020-03-12T08:35:46 [W|app|61664604] Not queueing Nic::Managed: ["Identifier has already been taken"]
2020-03-12T08:35:46 [W|app|61664604] Not queueing Nic::Managed: ["Identifier has already been taken"]
2020-03-12T08:35:46 [W|app|61664604] Not queueing Nic::Managed: ["Identifier has already been taken"]
2020-03-12T08:35:46 [W|app|61664604] Saving ;vdsmdummy; NIC for host hv1.foo.bar.example.com failed, skipping because:
2020-03-12T08:35:46 [W|app|61664604] Identifier has already been taken
2020-03-12T08:35:47 [I|app|61664604] Completed 500 Internal Server Error in 2237ms (ActiveRecord: 498.3ms)
2020-03-12T08:35:47 [F|app|61664604]
2020-03-12T08:35:47 [F|app|61664604] Exception (Exception):
2020-03-12T08:35:47 [F|app|61664604]
2020-03-12T08:35:47 [F|app|61664604] app/models/concerns/audit_extensions.rb:164:in `block (2 levels) in log_audit'
lib/foreman/logging.rb:77:in `block in with_fields'
lib/foreman/logging.rb:76:in `with_fields'
app/models/concerns/audit_extensions.rb:163:in `block in log_audit'
app/models/concerns/audit_extensions.rb:148:in `each_pair'
app/models/concerns/audit_extensions.rb:148:in `log_audit'
app/models/concerns/orchestration.rb:47:in `around_save_orchestration'
app/models/host/base.rb:566:in `block in update_virtuals'
app/models/host/base.rb:561:in `update_virtuals'
app/models/host/base.rb:542:in `set_interface'
app/models/host/base.rb:229:in `block in set_interfaces'
app/models/host/base.rb:226:in `each'
app/models/host/base.rb:226:in `set_interfaces'
app/models/host/base.rb:194:in `populate_fields_from_facts'
app/models/host/managed.rb:416:in `populate_fields_from_facts'
app/models/concerns/puppet_host_extensions.rb:3:in `populate_fields_from_facts'
app/models/host/base.rb:168:in `block in parse_facts'
app/services/foreman/telemetry_helper.rb:27:in `telemetry_duration_histogram'
app/models/host/base.rb:167:in `parse_facts'
app/models/host/base.rb:153:in `import_facts'
app/models/host/managed.rb:330:in `import_facts'
app/controllers/api/v2/hosts_controller.rb:308:in `facts'
app/controllers/api/v2/base_controller.rb:163:in `disable_json_root'
app/controllers/concerns/foreman/controller/timezone.rb:10:in `set_timezone'
app/models/concerns/foreman/thread_session.rb:32:in `clear_thread'
app/controllers/concerns/foreman/controller/topbar_sweeper.rb:12:in `set_topbar_sweeper_controller'
katello (3.14.1) lib/katello/middleware/event_daemon.rb:10:in `call'
lib/foreman/middleware/telemetry.rb:10:in `call'
lib/foreman/middleware/catch_json_parse_errors.rb:9:in `call'
lib/foreman/middleware/logging_context_session.rb:22:in `call'
lib/foreman/middleware/logging_context_request.rb:11:in `call'
katello (3.14.1) lib/katello/prevent_json_parsing.rb:12:in `call'