Scoped search across all implicit fields

Hello,

I’ve stumbled upon a performance issue on our dashboard. When you enter a query like “XXX”, Foreman (or rather scoped_search) comes up with something like the following (scroll to the right, keep going, keep going, see the LIKE/ILIKE statements? there you go!):

[6] pry(main)> Host.authorized(:view_hosts, Host).search_for("XXX")
[sql|D||]   SQL (1.3ms)  SELECT "hosts"."id" AS t0_r0, "hosts"."name" AS t0_r1, "hosts"."last_compile" AS t0_r2, "hosts"."last_report" AS t0_r3, "hosts"."updated_at" AS t0_r4, "hosts"."created_at" AS t0_r5, "hosts"."root_pass" AS t0_r6, "hosts"."architecture_id" AS t0_r7, "hosts"."operatingsystem_id" AS t0_r8, "hosts"."environment_id" AS t0_r9, "hosts"."ptable_id" AS t0_r10, "hosts"."medium_id" AS t0_r11, "hosts"."build" AS t0_r12, "hosts"."comment" AS t0_r13, "hosts"."disk" AS t0_r14, "hosts"."installed_at" AS t0_r15, "hosts"."model_id" AS t0_r16, "hosts"."hostgroup_id" AS t0_r17, "hosts"."owner_id" AS t0_r18, "hosts"."owner_type" AS t0_r19, "hosts"."enabled" AS t0_r20, "hosts"."puppet_ca_proxy_id" AS t0_r21, "hosts"."managed" AS t0_r22, "hosts"."use_image" AS t0_r23, "hosts"."image_file" AS t0_r24, "hosts"."uuid" AS t0_r25, "hosts"."compute_resource_id" AS t0_r26, "hosts"."puppet_proxy_id" AS t0_r27, "hosts"."certname" AS t0_r28, "hosts"."image_id" AS t0_r29, "hosts"."organization_id" AS t0_r30, "hosts"."location_id" AS t0_r31, "hosts"."type" AS t0_r32, "hosts"."otp" AS t0_r33, "hosts"."realm_id" AS t0_r34, "hosts"."compute_profile_id" AS t0_r35, "hosts"."provision_method" AS t0_r36, "hosts"."grub_pass" AS t0_r37, "hosts"."discovery_rule_id" AS t0_r38, "hosts"."global_status" AS t0_r39, "hosts"."lookup_value_matcher" AS t0_r40, "hosts"."pxe_loader" AS t0_r41, "hosts"."initiated_at" AS t0_r42, "hosts"."build_errors" AS t0_r43, "hosts"."salt_proxy_id" AS t0_r44, "hosts"."salt_environment_id" AS t0_r45, "models"."id" AS t1_r0, "models"."name" AS t1_r1, "models"."info" AS t1_r2, "models"."created_at" AS t1_r3, "models"."updated_at" AS t1_r4, "models"."vendor_class" AS t1_r5, "models"."hardware_model" AS t1_r6, "hostgroups"."id" AS t2_r0, "hostgroups"."name" AS t2_r1, "hostgroups"."created_at" AS t2_r2, "hostgroups"."updated_at" AS t2_r3, "hostgroups"."environment_id" AS t2_r4, "hostgroups"."operatingsystem_id" AS t2_r5, "hostgroups"."architecture_id" AS t2_r6, 
"hostgroups"."medium_id" AS t2_r7, "hostgroups"."ptable_id" AS t2_r8, "hostgroups"."root_pass" AS t2_r9, "hostgroups"."puppet_ca_proxy_id" AS t2_r10, "hostgroups"."use_image" AS t2_r11, "hostgroups"."image_file" AS t2_r12, "hostgroups"."ancestry" AS t2_r13, "hostgroups"."vm_defaults" AS t2_r14, "hostgroups"."subnet_id" AS t2_r15, "hostgroups"."domain_id" AS t2_r16, "hostgroups"."puppet_proxy_id" AS t2_r17, "hostgroups"."title" AS t2_r18, "hostgroups"."realm_id" AS t2_r19, "hostgroups"."compute_profile_id" AS t2_r20, "hostgroups"."grub_pass" AS t2_r21, "hostgroups"."lookup_value_matcher" AS t2_r22, "hostgroups"."subnet6_id" AS t2_r23, "hostgroups"."pxe_loader" AS t2_r24, "hostgroups"."description" AS t2_r25, "hostgroups"."compute_resource_id" AS t2_r26, "hostgroups"."salt_proxy_id" AS t2_r27, "hostgroups"."salt_environment_id" AS t2_r28, "domains"."id" AS t3_r0, "domains"."name" AS t3_r1, "domains"."fullname" AS t3_r2, "domains"."created_at" AS t3_r3, "domains"."updated_at" AS t3_r4, "domains"."dns_id" AS t3_r5, "realms"."id" AS t4_r0, "realms"."name" AS t4_r1, "realms"."realm_type" AS t4_r2, "realms"."realm_proxy_id" AS t4_r3, "realms"."created_at" AS t4_r4, "realms"."updated_at" AS t4_r5, "environments"."id" AS t5_r0, "environments"."name" AS t5_r1, "environments"."created_at" AS t5_r2, "environments"."updated_at" AS t5_r3, "architectures"."id" AS t6_r0, "architectures"."name" AS t6_r1, "architectures"."created_at" AS t6_r2, "architectures"."updated_at" AS t6_r3, "compute_resources"."id" AS t7_r0, "compute_resources"."name" AS t7_r1, "compute_resources"."description" AS t7_r2, "compute_resources"."url" AS t7_r3, "compute_resources"."user" AS t7_r4, "compute_resources"."password" AS t7_r5, "compute_resources"."uuid" AS t7_r6, "compute_resources"."type" AS t7_r7, "compute_resources"."created_at" AS t7_r8, "compute_resources"."updated_at" AS t7_r9, "compute_resources"."attrs" AS t7_r10, "compute_resources"."caching_enabled" AS t7_r11, 
"compute_resources"."http_proxy_id" AS t7_r12, "compute_resources"."domain" AS t7_r13, "images"."id" AS t8_r0, "images"."operatingsystem_id" AS t8_r1, "images"."compute_resource_id" AS t8_r2, "images"."architecture_id" AS t8_r3, "images"."uuid" AS t8_r4, "images"."username" AS t8_r5, "images"."name" AS t8_r6, "images"."created_at" AS t8_r7, "images"."updated_at" AS t8_r8, "images"."iam_role" AS t8_r9, "images"."user_data" AS t8_r10, "images"."password" AS t8_r11, "operatingsystems"."id" AS t9_r0, "operatingsystems"."major" AS t9_r1, "operatingsystems"."name" AS t9_r2, "operatingsystems"."minor" AS t9_r3, "operatingsystems"."nameindicator" AS t9_r4, "operatingsystems"."created_at" AS t9_r5, "operatingsystems"."updated_at" AS t9_r6, "operatingsystems"."release_name" AS t9_r7, "operatingsystems"."type" AS t9_r8, "operatingsystems"."description" AS t9_r9, "operatingsystems"."password_hash" AS t9_r10, "operatingsystems"."title" AS t9_r11, "primary_interfaces_hosts"."id" AS t10_r0, "primary_interfaces_hosts"."mac" AS t10_r1, "primary_interfaces_hosts"."ip" AS t10_r2, "primary_interfaces_hosts"."type" AS t10_r3, "primary_interfaces_hosts"."name" AS t10_r4, "primary_interfaces_hosts"."host_id" AS t10_r5, "primary_interfaces_hosts"."subnet_id" AS t10_r6, "primary_interfaces_hosts"."domain_id" AS t10_r7, "primary_interfaces_hosts"."attrs" AS t10_r8, "primary_interfaces_hosts"."created_at" AS t10_r9, "primary_interfaces_hosts"."updated_at" AS t10_r10, "primary_interfaces_hosts"."provider" AS t10_r11, "primary_interfaces_hosts"."username" AS t10_r12, "primary_interfaces_hosts"."password" AS t10_r13, "primary_interfaces_hosts"."virtual" AS t10_r14, "primary_interfaces_hosts"."link" AS t10_r15, "primary_interfaces_hosts"."identifier" AS t10_r16, "primary_interfaces_hosts"."tag" AS t10_r17, "primary_interfaces_hosts"."attached_to" AS t10_r18, "primary_interfaces_hosts"."managed" AS t10_r19, "primary_interfaces_hosts"."mode" AS t10_r20, 
"primary_interfaces_hosts"."attached_devices" AS t10_r21, "primary_interfaces_hosts"."bond_options" AS t10_r22, "primary_interfaces_hosts"."primary" AS t10_r23, "primary_interfaces_hosts"."provision" AS t10_r24, "primary_interfaces_hosts"."compute_attributes" AS t10_r25, "primary_interfaces_hosts"."ip6" AS t10_r26, "primary_interfaces_hosts"."subnet6_id" AS t10_r27, "subnets"."id" AS t11_r0, "subnets"."network" AS t11_r1, "subnets"."mask" AS t11_r2, "subnets"."priority" AS t11_r3, "subnets"."name" AS t11_r4, "subnets"."vlanid" AS t11_r5, "subnets"."created_at" AS t11_r6, "subnets"."updated_at" AS t11_r7, "subnets"."dhcp_id" AS t11_r8, "subnets"."tftp_id" AS t11_r9, "subnets"."gateway" AS t11_r10, "subnets"."dns_primary" AS t11_r11, "subnets"."dns_secondary" AS t11_r12, "subnets"."from" AS t11_r13, "subnets"."to" AS t11_r14, "subnets"."dns_id" AS t11_r15, "subnets"."boot_mode" AS t11_r16, "subnets"."ipam" AS t11_r17, "subnets"."discovery_id" AS t11_r18, "subnets"."type" AS t11_r19, "subnets"."description" AS t11_r20, "subnets"."template_id" AS t11_r21, "subnets"."mtu" AS t11_r22, "subnets"."httpboot_id" AS t11_r23, "subnet6s_hosts"."id" AS t12_r0, "subnet6s_hosts"."network" AS t12_r1, "subnet6s_hosts"."mask" AS t12_r2, "subnet6s_hosts"."priority" AS t12_r3, "subnet6s_hosts"."name" AS t12_r4, "subnet6s_hosts"."vlanid" AS t12_r5, "subnet6s_hosts"."created_at" AS t12_r6, "subnet6s_hosts"."updated_at" AS t12_r7, "subnet6s_hosts"."dhcp_id" AS t12_r8, "subnet6s_hosts"."tftp_id" AS t12_r9, "subnet6s_hosts"."gateway" AS t12_r10, "subnet6s_hosts"."dns_primary" AS t12_r11, "subnet6s_hosts"."dns_secondary" AS t12_r12, "subnet6s_hosts"."from" AS t12_r13, "subnet6s_hosts"."to" AS t12_r14, "subnet6s_hosts"."dns_id" AS t12_r15, "subnet6s_hosts"."boot_mode" AS t12_r16, "subnet6s_hosts"."ipam" AS t12_r17, "subnet6s_hosts"."discovery_id" AS t12_r18, "subnet6s_hosts"."type" AS t12_r19, "subnet6s_hosts"."description" AS t12_r20, "subnet6s_hosts"."template_id" AS t12_r21, 
"subnet6s_hosts"."mtu" AS t12_r22, "subnet6s_hosts"."httpboot_id" AS t12_r23, "provision_interfaces_hosts"."id" AS t13_r0, "provision_interfaces_hosts"."mac" AS t13_r1, "provision_interfaces_hosts"."ip" AS t13_r2, "provision_interfaces_hosts"."type" AS t13_r3, "provision_interfaces_hosts"."name" AS t13_r4, "provision_interfaces_hosts"."host_id" AS t13_r5, "provision_interfaces_hosts"."subnet_id" AS t13_r6, "provision_interfaces_hosts"."domain_id" AS t13_r7, "provision_interfaces_hosts"."attrs" AS t13_r8, "provision_interfaces_hosts"."created_at" AS t13_r9, "provision_interfaces_hosts"."updated_at" AS t13_r10, "provision_interfaces_hosts"."provider" AS t13_r11, "provision_interfaces_hosts"."username" AS t13_r12, "provision_interfaces_hosts"."password" AS t13_r13, "provision_interfaces_hosts"."virtual" AS t13_r14, "provision_interfaces_hosts"."link" AS t13_r15, "provision_interfaces_hosts"."identifier" AS t13_r16, "provision_interfaces_hosts"."tag" AS t13_r17, "provision_interfaces_hosts"."attached_to" AS t13_r18, "provision_interfaces_hosts"."managed" AS t13_r19, "provision_interfaces_hosts"."mode" AS t13_r20, "provision_interfaces_hosts"."attached_devices" AS t13_r21, "provision_interfaces_hosts"."bond_options" AS t13_r22, "provision_interfaces_hosts"."primary" AS t13_r23, "provision_interfaces_hosts"."provision" AS t13_r24, "provision_interfaces_hosts"."compute_attributes" AS t13_r25, "provision_interfaces_hosts"."ip6" AS t13_r26, "provision_interfaces_hosts"."subnet6_id" AS t13_r27, "discovery_rules"."id" AS t14_r0, "discovery_rules"."name" AS t14_r1, "discovery_rules"."search" AS t14_r2, "discovery_rules"."hostgroup_id" AS t14_r3, "discovery_rules"."hostname" AS t14_r4, "discovery_rules"."max_count" AS t14_r5, "discovery_rules"."priority" AS t14_r6, "discovery_rules"."enabled" AS t14_r7, "discovery_rules"."created_at" AS t14_r8, "discovery_rules"."updated_at" AS t14_r9 FROM "hosts" LEFT OUTER JOIN "models" ON "models"."id" = "hosts"."model_id" LEFT OUTER JOIN 
"hostgroups" ON "hostgroups"."id" = "hosts"."hostgroup_id" LEFT OUTER JOIN "nics" ON "nics"."host_id" = "hosts"."id" AND "nics"."primary" = ? LEFT OUTER JOIN "domains" ON "domains"."id" = "nics"."domain_id" LEFT OUTER JOIN "realms" ON "realms"."id" = "hosts"."realm_id" LEFT OUTER JOIN "environments" ON "environments"."id" = "hosts"."environment_id" LEFT OUTER JOIN "architectures" ON "architectures"."id" = "hosts"."architecture_id" LEFT OUTER JOIN "compute_resources" ON "compute_resources"."id" = "hosts"."compute_resource_id" LEFT OUTER JOIN "images" ON "images"."id" = "hosts"."image_id" LEFT OUTER JOIN "operatingsystems" ON "operatingsystems"."id" = "hosts"."operatingsystem_id" LEFT OUTER JOIN "nics" "primary_interfaces_hosts" ON "primary_interfaces_hosts"."host_id" = "hosts"."id" AND "primary_interfaces_hosts"."primary" = ? LEFT OUTER JOIN "nics" "primary_interfaces_hosts_join" ON "primary_interfaces_hosts_join"."host_id" = "hosts"."id" AND "primary_interfaces_hosts_join"."primary" = ? LEFT OUTER JOIN "subnets" ON "subnets"."id" = "primary_interfaces_hosts_join"."subnet_id" AND "subnets"."type" = ? LEFT OUTER JOIN "nics" "primary_interfaces_hosts_join_2" ON "primary_interfaces_hosts_join_2"."host_id" = "hosts"."id" AND "primary_interfaces_hosts_join_2"."primary" = ? LEFT OUTER JOIN "subnets" "subnet6s_hosts" ON "subnet6s_hosts"."id" = "primary_interfaces_hosts_join_2"."subnet6_id" AND "subnet6s_hosts"."type" = ? LEFT OUTER JOIN "nics" "provision_interfaces_hosts" ON "provision_interfaces_hosts"."host_id" = "hosts"."id" AND "provision_interfaces_hosts"."provision" = ? 
LEFT OUTER JOIN "discovery_rules" ON "discovery_rules"."id" = "hosts"."discovery_rule_id" WHERE "hosts"."type" IN ('Host::Managed') AND (("hosts"."name" LIKE '%XXX%' OR "hosts"."comment" LIKE '%XXX%' OR "models"."name" LIKE '%XXX%' OR "hostgroups"."name" LIKE '%XXX%' OR "hostgroups"."title" LIKE '%XXX%' OR "domains"."name" LIKE '%XXX%' OR "realms"."name" LIKE '%XXX%' OR "environments"."name" LIKE '%XXX%' OR "architectures"."name" LIKE '%XXX%' OR "compute_resources"."name" LIKE '%XXX%' OR "images"."name" LIKE '%XXX%' OR "operatingsystems"."name" LIKE '%XXX%' OR "operatingsystems"."description" LIKE '%XXX%' OR "operatingsystems"."title" LIKE '%XXX%' OR "nics"."ip" LIKE '%XXX%' OR "subnets"."network" LIKE '%XXX%' OR "subnets"."name" LIKE '%XXX%' OR "subnets"."network" LIKE '%XXX%' OR "subnets"."name" LIKE '%XXX%' OR "hosts"."uuid" LIKE '%XXX%' OR "nics"."mac" LIKE '%XXX%' OR "discovery_rules"."name" LIKE '%XXX%')) ORDER BY "hosts"."name" ASC  [["primary", "t"], ["primary", "t"], ["primary", "t"], ["type", "Subnet::Ipv4"], ["primary", "t"], ["type", "Subnet::Ipv6"], ["provision", "t"]]

However, if you provide a field explicitly, the search is much shorter and faster:

[7] pry(main)> Host.authorized(:view_hosts, Host).search_for("name = XXX")
[sql|D||]   Host::Managed Load (0.5ms)  SELECT "hosts".* FROM "hosts" WHERE "hosts"."type" IN ('Host::Managed') AND (("hosts"."name" = 'XXX')) ORDER BY "hosts"."name" ASC

Now, we are aware of this problem across the application; the most painful model is probably Host. The problem is that on the dashboard, when you enter “XXX” and then enable auto refresh (which was enabled by default until 1.21), you are essentially killing your Foreman instance: these huge queries, which can take several seconds, are issued many times (depending on the number of your widgets - eight by default, more if you have plugins).

I would love to approach this problem, but I don’t have good ideas. It’s definitely useful to have a “google-like” experience searching across most/some fields, but queries must also return something in reasonable time on bigger instances. A couple of thoughts:

  • Make even more fields explicit for the Host model; we probably have about 50 per cent of the fields explicit now (meaning they are dropped from the query unless the user enters the field explicitly).
  • Stop doing “google-like” search for all scoped search queries. Probably too radical, but I would like to hear opinions from the field on how useful this feature actually is.
  • The same, but only for the dashboard search. This sounds reasonable to me.
  • A nice javascript-generated warning when you enter a search without a field?
  • Other options?
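For option 1, scoped_search already supports this per field: as far as I know, marking a definition with :only_explicit keeps it out of bare queries like “XXX” while still allowing “uuid = XXX”. Here is a self-contained sketch of the semantics - FakeModel and its mini-DSL are stand-ins for illustration, not the real scoped_search internals:

```ruby
# Sketch of how an only_explicit flag changes which columns a bare query hits.
# FakeModel and this tiny DSL are hypothetical stand-ins for scoped_search.
class FakeModel
  @fields = []

  class << self
    attr_reader :fields

    # Mimics `scoped_search on: :name, only_explicit: true`
    def scoped_search(on:, only_explicit: false)
      @fields << { name: on, only_explicit: only_explicit }
    end

    # Columns searched for a bare query like "XXX" (no field given by the user)
    def implicit_search_fields
      @fields.reject { |f| f[:only_explicit] }.map { |f| f[:name] }
    end
  end

  scoped_search on: :name
  scoped_search on: :comment
  scoped_search on: :uuid, only_explicit: true  # only matched via "uuid = ..."
  scoped_search on: :mac,  only_explicit: true  # only matched via "mac = ..."
end

puts FakeModel.implicit_search_fields.inspect  # => [:name, :comment]
```

Every field moved to the explicit-only list is one fewer LIKE clause (and often one fewer join) in the monster query above.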

Relevant reading: https://bugzilla.redhat.com/show_bug.cgi?id=1650662

If we want to address the dashboard specifically, there is also the option of caching the hosts result across the widgets: run the query once and supply the result to each widget, which should at least reduce the number of queries to one per refresh.
Tagging @amirfefer to see if this aligns with the react-based approach for the dashboard.
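The caching idea could be as simple as memoizing the loaded result for the duration of one refresh. A sketch - DashboardRefresh and all its names are hypothetical, just to show the shape:

```ruby
# Hypothetical sketch of "run the search once per dashboard refresh":
# the expensive query runs a single time and every widget consumes
# the same loaded result.
class DashboardRefresh
  def initialize(query)
    @query = query
    @calls = 0
  end

  attr_reader :calls

  # Stand-in for the expensive Host...search_for(@query) call
  def load_hosts
    @calls += 1
    ["host-a.example.com", "host-b.example.com"]
  end

  # Memoize: all widgets within one refresh share one result
  def hosts
    @hosts ||= load_hosts
  end

  def render_widgets(count)
    count.times.map { hosts.size }
  end
end

refresh = DashboardRefresh.new("XXX")
refresh.render_widgets(8)  # eight widgets, as in the default dashboard
puts refresh.calls         # => 1
```

With the default eight widgets this alone would cut the per-refresh query count from eight to one, even before touching the implicit field list.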

Besides that, I would :+1: shrinking the implicit fields list.

I’d vote for option 1 - further shrinking. For disabling google-like search, I’d ask users; perhaps a community poll? I always use an explicit attribute when I’m searching. Dashboard search should remain consistent with the other search fields, I think. I don’t like the warning idea - how could I tell whether I can or cannot use this, if it only warns me that it may DoS my server?

We’re facing the “searching in too many implicit fields” issue in tasks too. I’d vote for reducing their number. Do we have any idea how much users rely on this?

I don’t think so.

Let’s probably just be opinionated and strip most of the fields. Is this a good question for the poll, @Gwmngilfen?

We would like to speed up the host search; which fields do you search most often:

  • host name
  • host ip
  • host mac
  • etc…

In theory yes, however I’m finalising the survey now for release today - I don’t see that there’s enough time to decide on a potential list of fields, and a free-text field won’t end well, trust me.

I could add something generic like “How much do you rely on the auto-completion of the search field?” as a 1-5 rating perhaps? I’ll need an answer in the next hour though.

If you can squeeze something in, great, but we are talking about implicit fields here. Meaning: if you enter “test”, which columns should Foreman automatically search? Currently it’s still many: name, subnet.name, ip, mac, etc. This has pretty much nothing to do with auto-completion, I think.

Ah, I misunderstood. That could be tricky then; there are a lot of potential options, and the options change depending on which page you’re on.

I don’t think we have time to figure out a good way to pry into that, but it’s definitely something we could talk to people about in-person over the next few weeks, as we move through conference season :wink:

While I would love to reduce the number of permitted implicit search fields, do keep in mind this may break people’s existing usage, especially API scripts, which may depend on implicit searching and would silently start returning incorrect results.


I agree. It’s hard to decide without data. We need input from users or foreman-debug log analysis to say what people really use.

We could do something like :only_explicit => Settings[:incompatibility_mode], which defaults to true. Then in the next release we restore the behavior if users complain they need it, or drop it if they don’t.

Doing nothing seems like the worst outcome: users who don’t rely on some attributes would keep suffering just because we don’t know whether anyone uses them.

I think @Roxanne_Hoover has plans for user interviews of some kind at FOSDEM, perhaps we could add this?

I can also take a look through the foreman-debug tarballs on the webserver; can someone supply a sample log line? Anything with “search” in the URL?

Please not more settings :smile: I’d be fine with being opinionated and dropping some fields if we give enough warning in advance. Also, it would be good to try and benchmark which of these fields are actually causing the majority of the slowness. I did do this for a few of them in https://github.com/theforeman/foreman/pull/4768 (and found some didn’t have a big impact) but further benchmarking could be useful to determine which of the fields would give us the biggest improvement.

Yes, any URL with search= in it would be a useful indication, though I’m not sure how big a sample we’ve got there.

The setting would be removed in the version after, and wouldn’t have to be exposed in the UI at all; it would only live in settings.yaml. But if this convinces you to be more opinionated, I’m fine with not adding it and just dropping those fields :smile: Agreed, we need to identify which fields cause the majority of the slowness.

Can I just check which thing we care about in the logs? Do we want the time taken to suggest the search (i.e. URLs like auto_complete_search=...), or do we care about the time taken to execute a particular search (i.e. /foo_controller/index?search=... with params “search” => …)?

I’m assuming the latter, do correct me if I’m wrong :slight_smile:

On that assumption, yeah, there’s not a lot to go on in the debug tarballs. Once you filter out foreman_tasks, notifications, job_invocations, name searches, API calls, etc., there are only about 20 log lines with searches in them, and there’s not a lot of variation in the parameters. I’ve pasted it below, but it’s not too useful, I think :slight_smile:

| URL | Search | Mean time (ms) | Std. deviation (ms) | Log lines |
| --- | --- | --- | --- | --- |
| /config_reports | `eventful = true` | 167.00 | NA | 1 |
| /hosts | `environment = production` | 166.00 | 22.62742 | 2 |
| /hosts | `last_report > "30 minutes ago" and status.enabled = true and status.applied = 0 and status.failed = 0 and status.pending = 0` | 120.50 | 113.84419 | 2 |
| /hosts | `not has last_report and status.enabled = true` | 86.50 | 75.66043 | 2 |
| /hosts | `not has location` | 84.50 | 28.06540 | 4 |
| /hosts | `not has organization` | 88.50 | 2.12132 | 2 |
| /hosts | `origin = Puppet and last_report > "35 minutes ago" and (status.applied > 0 or status.restarted > 0) and (status.failed = 0)` | 301.50 | 277.89297 | 2 |
| /puppetclasses | (empty), page 1 | 2304.00 | NA | 1 |
| /puppetclasses | `environment = production` | 4742.00 | NA | 1 |
| /templates/provisioning_templates | `finish`, page 1 | 70.00 | NA | 1 |
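For reference, the aggregation behind a table like this is just grouping the request durations by search parameter and taking the mean. A self-contained sketch - the log line format here is an assumption for illustration, not the real production format:

```ruby
# Rough sketch of aggregating search durations from request log lines.
# The line format below is an assumed example, not Foreman's actual format.
LINE = /Completed .* in (?<ms>\d+)ms .*search=(?<search>[^&\s)]+)/

def summarize(lines)
  grouped = lines.each_with_object(Hash.new { |h, k| h[k] = [] }) do |line, acc|
    m = LINE.match(line) or next      # skip lines without a search param
    acc[m[:search]] << m[:ms].to_i
  end
  # Reduce each group of durations to mean and sample count
  grouped.transform_values { |t| { mean: t.sum / t.size.to_f, n: t.size } }
end

logs = [
  "Completed 200 OK in 166ms (GET /hosts?search=environment+%3D+production)",
  "Completed 200 OK in 120ms (GET /hosts?search=environment+%3D+production)",
]
stats = summarize(logs)
puts stats["environment+%3D+production"][:mean]  # => 143.0
```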

This gives me only one search that uses implicit fields: “finish” in the templates search. All the others look like links (A HREFs) from our codebase. Looks like we need to optimize those as well :slight_smile:

Yeah, it’s not a great dataset. If anyone wants to share some log files, I can happily rerun the analysis :slight_smile:

As much as I would love to put this into telemetry, the search query itself is not something I can easily send to PCP or Prometheus. However, I can drop the search query and store the model class name together with the execution time. That gives us a nice picture of which models are slow, rather than what users search for, which is not very relevant.

This should be as easy as overriding search_for for all our models and putting two lines there. Any volunteers? :wink:

Hmmm, I was trying this but it does not work. I guess this is too much ActiveRecord magic for me:

require_relative 'concerns/audit_associations'
require_relative '../services/foreman/telemetry_helper'

class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
  extend Foreman::TelemetryHelper
  include Foreman::TelemetryHelper

  extend AuditAssociations::AssociationsDefinitions

  # Rails use Notifications for own sql logging so we can override sql logger for orchestration
  def self.logger
    Foreman::Logging.logger('app')
  end

  def logger
    self.class.logger
  end

  module TelemetryEnhancedMethods
    def search_for(*args)
      telemetry_duration_histogram(:scoped_search, :ms, model: self.class.name) do
        require 'pry'; binding.pry
        super(*args)
      end
    end
  end

  def self.prepended(base)
    class << base
      prepend TelemetryEnhancedMethods
    end
  end
end
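One likely reason the snippet above never fires: a def self.prepended hook on ApplicationRecord only runs if ApplicationRecord itself is prepended into something, which never happens. Prepending the module onto the singleton class directly should wrap the class method. A self-contained sketch with a stubbed telemetry helper - FakeRecord and TelemetryStub are stand-ins for the real ApplicationRecord and Foreman::TelemetryHelper:

```ruby
# Stand-in for Foreman::TelemetryHelper; records timings in memory.
module TelemetryStub
  def self.timings
    @timings ||= []
  end

  def telemetry_duration_histogram(name, _unit, labels = {})
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    elapsed_ms = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000
    TelemetryStub.timings << { name: name, labels: labels, ms: elapsed_ms }
    result
  end
end

class FakeRecord
  extend TelemetryStub

  # Stand-in for scoped_search's class-level search_for
  def self.search_for(query)
    "results for #{query}"
  end

  module TelemetryEnhancedMethods
    def search_for(*args)
      telemetry_duration_histogram(:scoped_search, :ms, model: name) do
        super(*args)
      end
    end
  end

  # Prepend onto the singleton class so the *class* method gets wrapped;
  # a `def self.prepended` hook on the class itself never fires because
  # the class is never prepended into anything.
  singleton_class.prepend(TelemetryEnhancedMethods)
end

puts FakeRecord.search_for("XXX")                  # => results for XXX
puts TelemetryStub.timings.first[:labels][:model]  # => FakeRecord
```

In the real code, ApplicationRecord could run the same singleton_class.prepend line at the bottom of the class body, so every model inheriting from it gets the timed search_for.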