Massive externalNode performance degradation after upgrade to 1.16

Hello everybody,
after updating foreman from 1.14 to 1.15 to version 1.16, we have massive performance problems - all puppet nodes timed out.

About our environment:

  • 2x sles 12 sp2, each with 4 cpus and 8gb memory, virtualized with vmware
  • ruby: 2.1, passenger-5.0.26
  • installed foreman from source code
  • current puppet version: 3.8.5 -> we are going to upgrade to 5, but foreman 1.16 must be upgraded first
  • about 1200 puppet nodes

Before the upgrade the two foreman server had a CPU usage of 10-15%, one enc call took about 500ms. After the upgrade the two foreman server had a CPU usage of 90-100%, one enc call took about 10s.
We also tested foreman 1.16 with ruby2.3 and passenger-5.2.3 however, it did not bring any performance improvements
Is there any known issue about performance degradation?

If you need further data, please let me know.

kind regards

Hey,

we are not aware of specific perf degratations at the moment, but this is not the first report of such a problem with 1.16. Which endpoint is slow, is it /facts or /host/externalNode or both? Are you on PostgreSQL or MySQL? Let me do summary of fact importing changes:

Related fact changes in Foreman 1.14 release:

  • Fixes #9016 - improved fact loading performance

Related fact changes in Foreman 1.16 release:

  • Fixes #18561 - faster fact deletion on MySQL

Related fact changes in Foreman 1.17 release:

  • Fixes #20956 - Don’t pass class constants to AR conditions
  • Fixes #15409 - Separated puppet facts from core
  • Fixes #21776 - Improved fact importing to deal with names
  • Fixes #20929 - Added fact name filtering on import

Related fact changes in develop branch (Foreman 1.18):

  • Fixes #22796 - fact importing telemetry

If I had to guess, I’d try:

http://projects.theforeman.org/issues/18561

Otherwise provide more context.

Hey,
thank you for your fast reply!
At the moment, we have these only problems with the externalNode endpoint. We’re running Foreman with PostgreSQL 9.6.
Further, it seems that the requests are slower the more puppet classes (and thereby smart class parameters) are added to the host.

Looks like it’s not fact importing code this time, good. Folks, any ideas? The externalNode controller is rather unchanged for 1.16, it must be something deeper in models.

Can it be data stacking up? A lot of NICs, fact names, tasks, trends?
Any plugins? My second suspect would be some missing association hints. Worth looking at query logs to see what queries are performed and looking for groups of similar queries with only ID parameter changing.

Hey,
I dont think that the data stacking up. We can reproduce the behaviour on a “isolated” foreman host - no other request were handled.
We dont have a lot of NICs, fact names, taks and so on.
The following plugins are activated: “foreman-tasks”, “foreman_hooks” and “foreman_remote_execution”.

Below your can see a small part of the querylog of one enc call from the puppetserver - there are a lot of “SELECT lookup_keys” and “SELECT lookup_values”.

2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_keys"."id" AS t0_r0, "lookup_keys"."key" AS t0_r1, "lookup_keys"."created_at" AS t0_r2, "lookup_keys"."updated_at" AS t0_r3, "lookup_keys"."puppetclass_id" AS t0_r4, "lookup_keys"."default_value" AS t0_r5, "lookup_keys"."path" AS t0_r6, "lookup_keys"."description" AS t0_r7, "lookup_keys"."validator_type" AS t0_r8, "lookup_keys"."validator_rule" AS t0_r9, "lookup_keys"."key_type" AS t0_r10, "lookup_keys"."override" AS t0_r11, "lookup_keys"."required" AS t0_r12, "lookup_keys"."merge_overrides" AS t0_r13, "lookup_keys"."avoid_duplicates" AS t0_r14, "lookup_keys"."omit" AS t0_r15, "lookup_keys"."type" AS t0_r16, "lookup_keys"."merge_default" AS t0_r17, "lookup_keys"."hidden_value" AS t0_r18, "environment_classes"."puppetclass_id" AS t1_r0, "environment_classes"."environment_id" AS t1_r1, "environment_classes"."id" AS t1_r2, "environment_classes"."puppetclass_lookup_key_id" AS t1_r3 FROM "lookup_keys" INNER JOIN "environment_classes" ON "environment_classes"."puppetclass_lookup_key_id" = "lookup_keys"."id" INNER JOIN "lookup_values" ON "lookup_values"."lookup_key_id" = "lookup_keys"."id" WHERE "lookup_keys"."type" IN ('PuppetclassLookupKey') AND "lookup_keys"."override" = $1 AND "environment_classes"."environment_id" = $2 AND "environment_classes"."puppetclass_id" IN (70, 191, 94, 347, 230, 10, 128, 772, 278, 50, 163, 636, 310, 131, 616, 116, 7)  ORDER BY lookup_keys.key  [["override", true], ["environment_id", 4]]
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_values".* FROM "lookup_values" WHERE "lookup_values"."match" IN ('fqdn=sonya-lawrence.server.de', 'hostgroup=', 'os=SLES 12.2', 'domain=server.de', 'environment=devel') AND "lookup_values"."lookup_key_id" IN (SELECT "lookup_keys"."id" FROM "lookup_keys" INNER JOIN "environment_classes" ON "environment_classes"."puppetclass_lookup_key_id" = "lookup_keys"."id" WHERE "lookup_keys"."type" IN ('PuppetclassLookupKey') AND "lookup_keys"."override" = $1 AND "environment_classes"."environment_id" = $2 AND "environment_classes"."puppetclass_id" IN (70, 191, 94, 347, 230, 10, 128, 772, 278, 50, 163, 636, 310, 131, 616, 116, 7)  ORDER BY lookup_keys.key)  [["override", true], ["environment_id", 4]]
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_keys".* FROM "lookup_keys" WHERE "lookup_keys"."id" IN (406)  ORDER BY lookup_keys.key
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_keys"."id" AS t0_r0, "lookup_keys"."key" AS t0_r1, "lookup_keys"."created_at" AS t0_r2, "lookup_keys"."updated_at" AS t0_r3, "lookup_keys"."puppetclass_id" AS t0_r4, "lookup_keys"."default_value" AS t0_r5, "lookup_keys"."path" AS t0_r6, "lookup_keys"."description" AS t0_r7, "lookup_keys"."validator_type" AS t0_r8, "lookup_keys"."validator_rule" AS t0_r9, "lookup_keys"."key_type" AS t0_r10, "lookup_keys"."override" AS t0_r11, "lookup_keys"."required" AS t0_r12, "lookup_keys"."merge_overrides" AS t0_r13, "lookup_keys"."avoid_duplicates" AS t0_r14, "lookup_keys"."omit" AS t0_r15, "lookup_keys"."type" AS t0_r16, "lookup_keys"."merge_default" AS t0_r17, "lookup_keys"."hidden_value" AS t0_r18, "environment_classes"."puppetclass_id" AS t1_r0, "environment_classes"."environment_id" AS t1_r1, "environment_classes"."id" AS t1_r2, "environment_classes"."puppetclass_lookup_key_id" AS t1_r3 FROM "lookup_keys" INNER JOIN "environment_classes" ON "environment_classes"."puppetclass_lookup_key_id" = "lookup_keys"."id" INNER JOIN "lookup_values" ON "lookup_values"."lookup_key_id" = "lookup_keys"."id" WHERE "lookup_keys"."type" IN ('PuppetclassLookupKey') AND "lookup_keys"."override" = $1 AND "environment_classes"."environment_id" = $2 AND "environment_classes"."puppetclass_id" IN (70, 191, 94, 347, 230, 10, 128, 772, 278, 50, 163, 636, 310, 131, 616, 116, 7)  ORDER BY lookup_keys.key  [["override", true], ["environment_id", 4]]
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_values".* FROM "lookup_values" WHERE "lookup_values"."match" IN ('fqdn=sonya-lawrence.server.de', 'hostgroup=', 'os=SLES 12.2', 'domain=server.de', 'environment=devel') AND "lookup_values"."lookup_key_id" IN (SELECT "lookup_keys"."id" FROM "lookup_keys" INNER JOIN "environment_classes" ON "environment_classes"."puppetclass_lookup_key_id" = "lookup_keys"."id" WHERE "lookup_keys"."type" IN ('PuppetclassLookupKey') AND "lookup_keys"."override" = $1 AND "environment_classes"."environment_id" = $2 AND "environment_classes"."puppetclass_id" IN (70, 191, 94, 347, 230, 10, 128, 772, 278, 50, 163, 636, 310, 131, 616, 116, 7)  ORDER BY lookup_keys.key)  [["override", true], ["environment_id", 4]]
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_keys".* FROM "lookup_keys" WHERE "lookup_keys"."id" IN (406)  ORDER BY lookup_keys.key
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_keys"."id" AS t0_r0, "lookup_keys"."key" AS t0_r1, "lookup_keys"."created_at" AS t0_r2, "lookup_keys"."updated_at" AS t0_r3, "lookup_keys"."puppetclass_id" AS t0_r4, "lookup_keys"."default_value" AS t0_r5, "lookup_keys"."path" AS t0_r6, "lookup_keys"."description" AS t0_r7, "lookup_keys"."validator_type" AS t0_r8, "lookup_keys"."validator_rule" AS t0_r9, "lookup_keys"."key_type" AS t0_r10, "lookup_keys"."override" AS t0_r11, "lookup_keys"."required" AS t0_r12, "lookup_keys"."merge_overrides" AS t0_r13, "lookup_keys"."avoid_duplicates" AS t0_r14, "lookup_keys"."omit" AS t0_r15, "lookup_keys"."type" AS t0_r16, "lookup_keys"."merge_default" AS t0_r17, "lookup_keys"."hidden_value" AS t0_r18, "environment_classes"."puppetclass_id" AS t1_r0, "environment_classes"."environment_id" AS t1_r1, "environment_classes"."id" AS t1_r2, "environment_classes"."puppetclass_lookup_key_id" AS t1_r3 FROM "lookup_keys" INNER JOIN "environment_classes" ON "environment_classes"."puppetclass_lookup_key_id" = "lookup_keys"."id" INNER JOIN "lookup_values" ON "lookup_values"."lookup_key_id" = "lookup_keys"."id" WHERE "lookup_keys"."type" IN ('PuppetclassLookupKey') AND "lookup_keys"."override" = $1 AND "environment_classes"."environment_id" = $2 AND "environment_classes"."puppetclass_id" IN (70, 191, 94, 347, 230, 10, 128, 772, 278, 50, 163, 636, 310, 131, 616, 116, 7)  ORDER BY lookup_keys.key  [["override", true], ["environment_id", 4]]
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_values".* FROM "lookup_values" WHERE "lookup_values"."match" IN ('fqdn=sonya-lawrence.server.de', 'hostgroup=', 'os=SLES 12.2', 'domain=server.de', 'environment=devel') AND "lookup_values"."lookup_key_id" IN (SELECT "lookup_keys"."id" FROM "lookup_keys" INNER JOIN "environment_classes" ON "environment_classes"."puppetclass_lookup_key_id" = "lookup_keys"."id" WHERE "lookup_keys"."type" IN ('PuppetclassLookupKey') AND "lookup_keys"."override" = $1 AND "environment_classes"."environment_id" = $2 AND "environment_classes"."puppetclass_id" IN (70, 191, 94, 347, 230, 10, 128, 772, 278, 50, 163, 636, 310, 131, 616, 116, 7)  ORDER BY lookup_keys.key)  [["override", true], ["environment_id", 4]]
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_keys".* FROM "lookup_keys" WHERE "lookup_keys"."id" IN (406)  ORDER BY lookup_keys.key
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_keys"."id" AS t0_r0, "lookup_keys"."key" AS t0_r1, "lookup_keys"."created_at" AS t0_r2, "lookup_keys"."updated_at" AS t0_r3, "lookup_keys"."puppetclass_id" AS t0_r4, "lookup_keys"."default_value" AS t0_r5, "lookup_keys"."path" AS t0_r6, "lookup_keys"."description" AS t0_r7, "lookup_keys"."validator_type" AS t0_r8, "lookup_keys"."validator_rule" AS t0_r9, "lookup_keys"."key_type" AS t0_r10, "lookup_keys"."override" AS t0_r11, "lookup_keys"."required" AS t0_r12, "lookup_keys"."merge_overrides" AS t0_r13, "lookup_keys"."avoid_duplicates" AS t0_r14, "lookup_keys"."omit" AS t0_r15, "lookup_keys"."type" AS t0_r16, "lookup_keys"."merge_default" AS t0_r17, "lookup_keys"."hidden_value" AS t0_r18, "environment_classes"."puppetclass_id" AS t1_r0, "environment_classes"."environment_id" AS t1_r1, "environment_classes"."id" AS t1_r2, "environment_classes"."puppetclass_lookup_key_id" AS t1_r3 FROM "lookup_keys" INNER JOIN "environment_classes" ON "environment_classes"."puppetclass_lookup_key_id" = "lookup_keys"."id" INNER JOIN "lookup_values" ON "lookup_values"."lookup_key_id" = "lookup_keys"."id" WHERE "lookup_keys"."type" IN ('PuppetclassLookupKey') AND "lookup_keys"."override" = $1 AND "environment_classes"."environment_id" = $2 AND "environment_classes"."puppetclass_id" IN (70, 191, 94, 347, 230, 10, 128, 772, 278, 50, 163, 636, 310, 131, 616, 116, 7)  ORDER BY lookup_keys.key  [["override", true], ["environment_id", 4]]
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_values".* FROM "lookup_values" WHERE "lookup_values"."match" IN ('fqdn=sonya-lawrence.server.de', 'hostgroup=', 'os=SLES 12.2', 'domain=server.de', 'environment=devel') AND "lookup_values"."lookup_key_id" IN (SELECT "lookup_keys"."id" FROM "lookup_keys" INNER JOIN "environment_classes" ON "environment_classes"."puppetclass_lookup_key_id" = "lookup_keys"."id" WHERE "lookup_keys"."type" IN ('PuppetclassLookupKey') AND "lookup_keys"."override" = $1 AND "environment_classes"."environment_id" = $2 AND "environment_classes"."puppetclass_id" IN (70, 191, 94, 347, 230, 10, 128, 772, 278, 50, 163, 636, 310, 131, 616, 116, 7)  ORDER BY lookup_keys.key)  [["override", true], ["environment_id", 4]]
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_keys".* FROM "lookup_keys" WHERE "lookup_keys"."id" IN (406)  ORDER BY lookup_keys.key
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_keys"."id" AS t0_r0, "lookup_keys"."key" AS t0_r1, "lookup_keys"."created_at" AS t0_r2, "lookup_keys"."updated_at" AS t0_r3, "lookup_keys"."puppetclass_id" AS t0_r4, "lookup_keys"."default_value" AS t0_r5, "lookup_keys"."path" AS t0_r6, "lookup_keys"."description" AS t0_r7, "lookup_keys"."validator_type" AS t0_r8, "lookup_keys"."validator_rule" AS t0_r9, "lookup_keys"."key_type" AS t0_r10, "lookup_keys"."override" AS t0_r11, "lookup_keys"."required" AS t0_r12, "lookup_keys"."merge_overrides" AS t0_r13, "lookup_keys"."avoid_duplicates" AS t0_r14, "lookup_keys"."omit" AS t0_r15, "lookup_keys"."type" AS t0_r16, "lookup_keys"."merge_default" AS t0_r17, "lookup_keys"."hidden_value" AS t0_r18, "environment_classes"."puppetclass_id" AS t1_r0, "environment_classes"."environment_id" AS t1_r1, "environment_classes"."id" AS t1_r2, "environment_classes"."puppetclass_lookup_key_id" AS t1_r3 FROM "lookup_keys" INNER JOIN "environment_classes" ON "environment_classes"."puppetclass_lookup_key_id" = "lookup_keys"."id" INNER JOIN "lookup_values" ON "lookup_values"."lookup_key_id" = "lookup_keys"."id" WHERE "lookup_keys"."type" IN ('PuppetclassLookupKey') AND "lookup_keys"."override" = $1 AND "environment_classes"."environment_id" = $2 AND "environment_classes"."puppetclass_id" IN (70, 191, 94, 347, 230, 10, 128, 772, 278, 50, 163, 636, 310, 131, 616, 116, 7)  ORDER BY lookup_keys.key  [["override", true], ["environment_id", 4]]
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_values".* FROM "lookup_values" WHERE "lookup_values"."match" IN ('fqdn=sonya-lawrence.server.de', 'hostgroup=', 'os=SLES 12.2', 'domain=server.de', 'environment=devel') AND "lookup_values"."lookup_key_id" IN (SELECT "lookup_keys"."id" FROM "lookup_keys" INNER JOIN "environment_classes" ON "environment_classes"."puppetclass_lookup_key_id" = "lookup_keys"."id" WHERE "lookup_keys"."type" IN ('PuppetclassLookupKey') AND "lookup_keys"."override" = $1 AND "environment_classes"."environment_id" = $2 AND "environment_classes"."puppetclass_id" IN (70, 191, 94, 347, 230, 10, 128, 772, 278, 50, 163, 636, 310, 131, 616, 116, 7)  ORDER BY lookup_keys.key)  [["override", true], ["environment_id", 4]]
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_keys".* FROM "lookup_keys" WHERE "lookup_keys"."id" IN (406)  ORDER BY lookup_keys.key
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_keys"."id" AS t0_r0, "lookup_keys"."key" AS t0_r1, "lookup_keys"."created_at" AS t0_r2, "lookup_keys"."updated_at" AS t0_r3, "lookup_keys"."puppetclass_id" AS t0_r4, "lookup_keys"."default_value" AS t0_r5, "lookup_keys"."path" AS t0_r6, "lookup_keys"."description" AS t0_r7, "lookup_keys"."validator_type" AS t0_r8, "lookup_keys"."validator_rule" AS t0_r9, "lookup_keys"."key_type" AS t0_r10, "lookup_keys"."override" AS t0_r11, "lookup_keys"."required" AS t0_r12, "lookup_keys"."merge_overrides" AS t0_r13, "lookup_keys"."avoid_duplicates" AS t0_r14, "lookup_keys"."omit" AS t0_r15, "lookup_keys"."type" AS t0_r16, "lookup_keys"."merge_default" AS t0_r17, "lookup_keys"."hidden_value" AS t0_r18, "environment_classes"."puppetclass_id" AS t1_r0, "environment_classes"."environment_id" AS t1_r1, "environment_classes"."id" AS t1_r2, "environment_classes"."puppetclass_lookup_key_id" AS t1_r3 FROM "lookup_keys" INNER JOIN "environment_classes" ON "environment_classes"."puppetclass_lookup_key_id" = "lookup_keys"."id" INNER JOIN "lookup_values" ON "lookup_values"."lookup_key_id" = "lookup_keys"."id" WHERE "lookup_keys"."type" IN ('PuppetclassLookupKey') AND "lookup_keys"."override" = $1 AND "environment_classes"."environment_id" = $2 AND "environment_classes"."puppetclass_id" IN (70, 191, 94, 347, 230, 10, 128, 772, 278, 50, 163, 636, 310, 131, 616, 116, 7)  ORDER BY lookup_keys.key  [["override", true], ["environment_id", 4]]
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_values".* FROM "lookup_values" WHERE "lookup_values"."match" IN ('fqdn=sonya-lawrence.server.de', 'hostgroup=', 'os=SLES 12.2', 'domain=server.de', 'environment=devel') AND "lookup_values"."lookup_key_id" IN (SELECT "lookup_keys"."id" FROM "lookup_keys" INNER JOIN "environment_classes" ON "environment_classes"."puppetclass_lookup_key_id" = "lookup_keys"."id" WHERE "lookup_keys"."type" IN ('PuppetclassLookupKey') AND "lookup_keys"."override" = $1 AND "environment_classes"."environment_id" = $2 AND "environment_classes"."puppetclass_id" IN (70, 191, 94, 347, 230, 10, 128, 772, 278, 50, 163, 636, 310, 131, 616, 116, 7)  ORDER BY lookup_keys.key)  [["override", true], ["environment_id", 4]]
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_keys".* FROM "lookup_keys" WHERE "lookup_keys"."id" IN (406)  ORDER BY lookup_keys.key
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_keys"."id" AS t0_r0, "lookup_keys"."key" AS t0_r1, "lookup_keys"."created_at" AS t0_r2, "lookup_keys"."updated_at" AS t0_r3, "lookup_keys"."puppetclass_id" AS t0_r4, "lookup_keys"."default_value" AS t0_r5, "lookup_keys"."path" AS t0_r6, "lookup_keys"."description" AS t0_r7, "lookup_keys"."validator_type" AS t0_r8, "lookup_keys"."validator_rule" AS t0_r9, "lookup_keys"."key_type" AS t0_r10, "lookup_keys"."override" AS t0_r11, "lookup_keys"."required" AS t0_r12, "lookup_keys"."merge_overrides" AS t0_r13, "lookup_keys"."avoid_duplicates" AS t0_r14, "lookup_keys"."omit" AS t0_r15, "lookup_keys"."type" AS t0_r16, "lookup_keys"."merge_default" AS t0_r17, "lookup_keys"."hidden_value" AS t0_r18, "environment_classes"."puppetclass_id" AS t1_r0, "environment_classes"."environment_id" AS t1_r1, "environment_classes"."id" AS t1_r2, "environment_classes"."puppetclass_lookup_key_id" AS t1_r3 FROM "lookup_keys" INNER JOIN "environment_classes" ON "environment_classes"."puppetclass_lookup_key_id" = "lookup_keys"."id" INNER JOIN "lookup_values" ON "lookup_values"."lookup_key_id" = "lookup_keys"."id" WHERE "lookup_keys"."type" IN ('PuppetclassLookupKey') AND "lookup_keys"."override" = $1 AND "environment_classes"."environment_id" = $2 AND "environment_classes"."puppetclass_id" IN (70, 191, 94, 347, 230, 10, 128, 772, 278, 50, 163, 636, 310, 131, 616, 116, 7)  ORDER BY lookup_keys.key  [["override", true], ["environment_id", 4]]
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_values".* FROM "lookup_values" WHERE "lookup_values"."match" IN ('fqdn=sonya-lawrence.server.de', 'hostgroup=', 'os=SLES 12.2', 'domain=server.de', 'environment=devel') AND "lookup_values"."lookup_key_id" IN (SELECT "lookup_keys"."id" FROM "lookup_keys" INNER JOIN "environment_classes" ON "environment_classes"."puppetclass_lookup_key_id" = "lookup_keys"."id" WHERE "lookup_keys"."type" IN ('PuppetclassLookupKey') AND "lookup_keys"."override" = $1 AND "environment_classes"."environment_id" = $2 AND "environment_classes"."puppetclass_id" IN (70, 191, 94, 347, 230, 10, 128, 772, 278, 50, 163, 636, 310, 131, 616, 116, 7)  ORDER BY lookup_keys.key)  [["override", true], ["environment_id", 4]]
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_keys".* FROM "lookup_keys" WHERE "lookup_keys"."id" IN (406)  ORDER BY lookup_keys.key
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_keys"."id" AS t0_r0, "lookup_keys"."key" AS t0_r1, "lookup_keys"."created_at" AS t0_r2, "lookup_keys"."updated_at" AS t0_r3, "lookup_keys"."puppetclass_id" AS t0_r4, "lookup_keys"."default_value" AS t0_r5, "lookup_keys"."path" AS t0_r6, "lookup_keys"."description" AS t0_r7, "lookup_keys"."validator_type" AS t0_r8, "lookup_keys"."validator_rule" AS t0_r9, "lookup_keys"."key_type" AS t0_r10, "lookup_keys"."override" AS t0_r11, "lookup_keys"."required" AS t0_r12, "lookup_keys"."merge_overrides" AS t0_r13, "lookup_keys"."avoid_duplicates" AS t0_r14, "lookup_keys"."omit" AS t0_r15, "lookup_keys"."type" AS t0_r16, "lookup_keys"."merge_default" AS t0_r17, "lookup_keys"."hidden_value" AS t0_r18, "environment_classes"."puppetclass_id" AS t1_r0, "environment_classes"."environment_id" AS t1_r1, "environment_classes"."id" AS t1_r2, "environment_classes"."puppetclass_lookup_key_id" AS t1_r3 FROM "lookup_keys" INNER JOIN "environment_classes" ON "environment_classes"."puppetclass_lookup_key_id" = "lookup_keys"."id" INNER JOIN "lookup_values" ON "lookup_values"."lookup_key_id" = "lookup_keys"."id" WHERE "lookup_keys"."type" IN ('PuppetclassLookupKey') AND "lookup_keys"."override" = $1 AND "environment_classes"."environment_id" = $2 AND "environment_classes"."puppetclass_id" IN (70, 191, 94, 347, 230, 10, 128, 772, 278, 50, 163, 636, 310, 131, 616, 116, 7)  ORDER BY lookup_keys.key  [["override", true], ["environment_id", 4]]
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_values".* FROM "lookup_values" WHERE "lookup_values"."match" IN ('fqdn=sonya-lawrence.server.de', 'hostgroup=', 'os=SLES 12.2', 'domain=server.de', 'environment=devel') AND "lookup_values"."lookup_key_id" IN (SELECT "lookup_keys"."id" FROM "lookup_keys" INNER JOIN "environment_classes" ON "environment_classes"."puppetclass_lookup_key_id" = "lookup_keys"."id" WHERE "lookup_keys"."type" IN ('PuppetclassLookupKey') AND "lookup_keys"."override" = $1 AND "environment_classes"."environment_id" = $2 AND "environment_classes"."puppetclass_id" IN (70, 191, 94, 347, 230, 10, 128, 772, 278, 50, 163, 636, 310, 131, 616, 116, 7)  ORDER BY lookup_keys.key)  [["override", true], ["environment_id", 4]]
2018-05-03T07:49:19 2b02b61d [sql] [D]   CACHE (0.0ms)  SELECT "lookup_keys".* FROM "lookup_keys" WHERE "lookup_keys"."id" IN (406)  ORDER BY lookup_keys.key

The full querylog is available under https://nopaste.linux-dev.org/?1181191

Sorry, linux-dev cut the querylog. Here the full querylog: https://pastebin.com/cNVReuPp

So out of the 3 seconds, nothing was spent in view or active record. Looks like Ruby is doing some iterations somewhere.

Completed 200 OK in 3144ms (Views: 0.5ms | ActiveRecord: 30.8ms)

Seems like a Ruby code regression, we should sniff around lookup key/values.

also worth mentioning the very large amount of cached sql queries, maybe we
need to store it in some local variable instead of going to active record?

Don’t want to blame anybody, but in 1.16 we introduced redesign of host info, providers were introduced to allow extending of ENC data. This was part of:

http://projects.theforeman.org/issues/15711

This possibly could be root cause, unfortunately this is rather complex code change, I am unable to tell and this needs further analysis. Creating a simple benchmark and running it before/after would tell us perhaps.

Thank you very much for your analysis. You can reproduce this with the ENC script (as provided here: Foreman :: Manual, e.g. $ time ./node.rb $(hostname -f) > /dev/null) or just by clicking on “YAML” in the Foreman web UI.

With the depth of this topic and with these Foreman and Ruby internals, I am at my wit’s end.

Do you need any further debug information or do you have any tips on how to trace this problem at Ruby level?

This is far from an analysis, I was just looking into git and trying to figure out files which changed in 1.16. I might be very wrong.

Can you describe in more detail the host you are experiencing slow performance with? I mean, can you pastebin the ENC output actually? Make sure all sensitive data is redacted, but it would be useful to see how much data are we talking about. Also, the benchmark would use the same or similar input.

You can find an example here: https://pastebin.com/Kc8shXgw
Due to the anonymization the yaml might have syntax errors - sorry about that.

Meanwhile we did the following test with a host and the smart class parameters:

  • Host with 17 included puppet classes with a few activated smart class parameters (configure -> Smart Class Parameters -> “Override” selected): ~ 800ms
  • Host with 17 included puppet classes with some more activated smart class parameters (configure -> Smart Class Parameters -> “Override”): ~ 3000ms
  • Host with 17 include puppet classes with some more activated smart class parameters (configure -> Smart Class Parameters -> “Override”) and overwritten on the host: ~ 6000ms
1 Like

Thanks for the snippet, that is fine. Just please confirm that the amount of data is roughly correct. This does not look like a huge ENC output.

Class and smart class params are a bit complicated code I haven’t looked into yet, so I will let some more experienced devs to comment on that - everything above one second is IMHO bad performance.

Yes, the amount of data is roughly correct. Of course the public ssh keys are a bit longer :P, but the number of parameters is correct.

We have the same problem.

We had a separate post before the community vm problem - but really our data belongs here.

@ohadlevy was replying to me on that other thread - but I think this data belongs on this thread now since it looks like the a related problem.

We upgraded from 1.15 to 1.17 and suffered huge performance degradation which rendered foreman unusable.

We downgraded to 1.16 and had the same problem. It was fixed when we went back to 1.15.

In 1.15 a single node query – /node/node.name?format=yml took 0.8 seconds.

In 1.16 it takes around 19 seconds which renders foreman unusable because it’s burning 100% of one CPU during this time. So we end up burning all CPUs on our 7 foreman servers (which only run foreman) and our puppet runs start failing because of time outs.

I believe the same problem was true for 1.17 but I did not see the logs personally to confirm it. All I know is the foreman servers were burning all of the CPUs they had handling the /node/node.name?format=yml endpoint.

Overall we have around 1,000 classes, 4,000 smart class variables, 8,000 hosts, 18 puppet environments, 1 smart variable.

The ENC for the host from the log entries below has 67 classes, 148 class parameters and 13 parameters.

The following is an extract from a single host request to /node/single.node.name?format=yml .

I used the request ID to grep every line from the log for the single /node call, cut off the time stamps and did a sort | uniq -c .

The number on the front of each line is the number of times that line appeared in the log for a single /node/ request for one node.

    737 [sql] [D]   CACHE (0.0ms)  SELECT "parameters".* FROM "parameters" WHERE "parameters"."type" IN ('CommonParameter')  ORDER BY parameters.name
    738 [sql] [D]   CACHE (0.0ms)  SELECT "parameters".* FROM "parameters" WHERE "parameters"."type" IN ('DomainParameter') AND "parameters"."reference_id" = $1  ORDER BY parameters.name  [["reference_id", 2]]
    740 [sql] [D]   CACHE (0.0ms)  SELECT "parameters".* FROM "parameters" WHERE "parameters"."type" IN ('GroupParameter') AND "parameters"."reference_id" = $1  ORDER BY parameters.name  [["reference_id", 68]]
    737 [sql] [D]   CACHE (0.0ms)  SELECT "parameters".* FROM "parameters" WHERE "parameters"."type" IN ('GroupParameter') AND "parameters"."reference_id" = $1  ORDER BY parameters.name  [["reference_id", 7]]
    736 [sql] [D]   CACHE (0.0ms)  SELECT "parameters".* FROM "parameters" WHERE "parameters"."type" IN ('GroupParameter') AND "parameters"."reference_id" = $1  ORDER BY parameters.name  [["reference_id", 9]]
    736 [sql] [D]   CACHE (0.0ms)  SELECT "parameters".* FROM "parameters" WHERE "parameters"."type" IN ('HostParameter') AND "parameters"."reference_id" = $1  ORDER BY parameters.name  [["reference_id", 1047]]
    737 [sql] [D]   CACHE (0.0ms)  SELECT "parameters".* FROM "parameters" WHERE "parameters"."type" IN ('OsParameter') AND "parameters"."reference_id" = $1  ORDER BY parameters.name  [["reference_id", 1]]

I can do the same logging on 1.15 by comparison if that would help and I can provide the full log from 1.16. We can run up a test 1.17 or 1.16 and put our data into it for testing as well.

We can not upgrade to 1.16 or 1.17 at the moment because foreman is unusable due to this performance problem.

I believe I’m also suffering from this issue. ~13,000 hosts, ~800 classes per environment ~20 environments. Prolly 20 to 120 smartclass params per host.

Unlike others we did not fall back though. We have 18 puppet masters with 16 cores and 55gb RAM each, and allow something like 24 max passenger processes per host. Previously this gave us way overkill redundancy with the ability to lose 66 percent of our puppet masters, and still function fine… Now it’s a lot less, but it can “keep up” overall. Hope these numbers help anyone who wants to upgrade, and has virtual resources to spare to scale out for the time being…

Definitely notice the yaml button is very slow in the UI compared to 1.15.x though being the easiest way to see something is amiss. Would love to see if this can be fixed… Once we figure out what it is :slight_smile:

Can anyone of you share their db? It would make developers life a lot easier to reproduce the issue.

We have what looks like a similar problem.

Ignore the api build problems in that orinal post.

Ignore the above - it’s from the restore of the data. I can’t delete the post.