RFC: Installer Output and Logging Format

I have been looking into changing the standard progress output of the installer along with enhancements to the log output. There are a few ways we can go with this as well as some smaller changes to increase the debuggability of the log files.

There are a number of items to unpack and weigh in on here. I would really appreciate if folks gave in-depth thoughts or even just simple replies with which aspects they like to help guide the changes in this area.

Progress Output

When we think about running the installer in it’s standard way (no verbose mode or modification to log levels) there are two ways I think we could approach it. The current method of using a progress bar but breaking out the steps to show progress step by step or a log-style approach.

Option 1: Progress Bar Style

The first option is continuing with the progress bar theme but breaking this out by stage of the installer, showing progress for each stage and elapsed time and then moving on to the next step without erasing the previous one from the screen. This gives more of a sequence feel to the process and could include things like total number of steps and what step the process is currently on.

Added this based on feedback to show a progress style output but using step counts rather than bars:

Option 2: Log Based

The second option is to unify the log file output with the progress output from running the installer to be the same output in both places. This would keep the output the same no matter where a user looked and be affected by verbosity levels where a user wanted to run installation with more information, or run it in noop mode but with more details. This method is not as nice to look at but is straight forward.

Log Formatting Changes

The current installer log output and formatting prints the following information:

  • date and time
  • log level
  • logger name
  • message

For this proposal, I’d like to modify both what gets printed and the formatting of that data as it appears in the log file or on the screen when using verbose mode.

Log Data

The only proposed change here is to drop the name of the logger (e.g. verbose, main) since that has no contextual value to a user debugging. Instead, replace this with the stage the installer is at to better understand where an error might be coming from. For example, this would denote if the installer was running a boot hook, pre hook, puppet apply or post install hook. Further, actions within a hook could denote themselves as a stage and provide further context. For example, package unlocking could identify itself as a stage, e.g.

2020-08-13 09:14:44 INFO [init] Warming up
2020-08-13 09:14:45 INFO [boot] Executing boot hooks
2020-08-13 09:14:46 INFO [package_unlock] Package versions are being unlocked
2020-08-13 09:14:46 INFO [pre] Executing pre configure hooks

Log Format

This topic area relates to how the data is arranged within the log at a line by line level. This aims to take into account readability, the types of data we want to display and some bit of reading and thinking of formats that may enhance visual parsing.

Current Format:

[INFO  2020-08-13T09:10:51 init] Warming up
[INFO  2020-08-13T09:10:52 boot] Executing boot hooks
[INFO  2020-08-13T09:10:53 package_unlock] Package versions are being unlocked
[INFO  2020-08-13T09:10:53 pre] Executing pre configure hooks
[WARN 2020-08-12T20:45:50 configure]  Stage[main]/Foreman_proxy::Module::Logs/Foreman_proxy::Module[logs]/Foreman_proxy::Feature[Lo
gs]/Datacat_fragment[foreman_proxy::enabled_features::Logs]: Skipping because of failed dependencies
[ERROR 2020-08-13T09:10:54 configure] Puppet apply error

Here I have presented 3 options from my reading to think about and spark conversation. Feel free to either reply and say you like one, to propose a new formatting all together or propose some tweaks to one.

Also, in the below option examples I have chosen to spit the date to be separated by a space rather than the T character that is part of the ISO 8601 format. I did this to show a different possibility from our current format and because I personally find it easier to read when debugging. Please do share regarding this as well.

Option 1

Move the date to the left, and only wrap the stage in brackets to emphasize.

2020-08-13 09:13:38 INFO  [init] Warming up
2020-08-13 09:13:39 INFO  [boot] Executing boot hooks
2020-08-13 09:13:40 INFO  [package_unlock] Package versions are being unlocked
2020-08-13 09:13:40 INFO  [pre] Executing pre configure hooks
2020-08-13 09:13:40 WARN  [configure] Stage[main]/Foreman_proxy::Module::Logs/Foreman_proxy::Module[logs]/Foreman_proxy::Feature[Lo
gs]/Datacat_fragment[foreman_proxy::enabled_features::Logs]: Skipping because of failed dependencies
2020-08-13 09:13:40 ERROR [configure] Puppet apply error

Option 2

Space separators between each data point

2020-08-13 | 09:13:38 | INFO  | init           | Warming up
2020-08-13 | 09:13:39 | INFO  | boot           | Executing boot hooks
2020-08-13 | 09:13:40 | INFO  | package_unlock | Package versions are being unlocked
2020-08-13 | 09:13:40 | INFO  | pre            | Executing pre configure hooks
2020-08-13 | 09:13:40 | WARN  | configure      | Stage[main]/Foreman_proxy::Module::Logs/Foreman_proxy::Module[logs]/Foreman_proxy::Feature[Lo
gs]/Datacat_fragment[foreman_proxy::enabled_features::Logs]: Skipping because of failed dependencies
2020-08-13 | 09:13:40 | ERROR | configure      | Puppet apply error

Option 3

Current format, drop brackets to only wrap stage

INFO  2020-08-13 09:13:38 [init] Warming up
INFO  2020-08-13 09:13:39 [boot] Executing boot hooks
INFO  2020-08-13 09:13:40 [package_unlock] Package versions are being unlocked
INFO  2020-08-13 09:13:40 [pre] Executing pre configure hooks
WARN  2020-08-13 09:13:40 [configure] Stage[main]/Foreman_proxy::Module::Logs/Foreman_proxy::Module[logs]/Foreman_proxy::Feature[Lo
gs]/Datacat_fragment[foreman_proxy::enabled_features::Logs]: Skipping because of failed dependencies
ERROR 2020-08-13 09:13:40 [configure] Puppet apply error

Option 1: Progress Bar Style

Can we even show reliable progress bars for each stage? If so, what are they based on? Puppet knowing the remaining number of “steps” to perform in each stage? I’d very much like to avoid windows style progress bars (xkcd 612), if possible

Even if we could display reliable progress bars, I’d probably still favor the log based output. I think it is not a bad idea to have logs and the live output from the command visually similar. (here I assume that even with progress bars the logs would be log based.)

Completely agree with the log data paragraph.

As far as log format goes, I wouldn’t do anything too crazy and would stick with 1 or 3, possibly in this order.

On a semi-serious note, my concern with 2 is that there would be an insanely long stage name and when you looked at the logs, you would see the timestamp, log level, stage and nothing, unless you have one of those ultra-wide monitors, making pre-processing with scripts almost necessary. Even the package unlocks pad things quite a lot and if you’d grep -v them away, you would waste a lot of screen space.

In vanilla Foreman most hooks have trivial runtime. The only exception is the PostgreSQL upgrade on EL7. However, that would be on huge step so a progress bar isn’t all that useful.

The majority of the time is spent in gathering defaults (which does a Puppet run in noop mode) and the application step (which runs Puppet without noop). So I don’t think the additional progress bars don’t have that much value.

It’s a good point and as @ekohl also notes most hooks are short, minus the occasional upgrade related hook that generally ends up unpredictable time wise. I do think there is an argument to be made for showing users that something is happening (i.e. progress) to prevent a feeling of “is this thing stuck?” and gauging somewhat where the process is at. I added a screenshot of another “progress” style where we can show number of steps left in a stage and still show runtime.

A suggestion from @Marek_Hulan has been to indicate a number for each stage and how many stages are thus left. Something akin to the left side here:

[1/7] Warming up: (1/1) 00:00:00

[2/7] Executing boot hooks: (5/5) 00:00:00

[3/7] Package versions are being unlocked: (1.0/1) 00:00:01

[4/7] Executing pre hooks: (3/3) 00:00:01

[5/7] Configuring system: (700/700) 00:00:19

[6/7] Executing post install hooks: (5/5) 00:00:19

[7/7] Package versions are being locked: (1.0/1) 00:00:200

* Foreman is running at https://foreman.example.com

* Login credentials -- username: admin, password: changeme

The full log is at /var/log/foreman-installer/foreman.log

Assuming we can calculate that completely at boot time.

While I would not have complaint with the current solution, I am happy with some improvement and every of the options looks fine.

For the logs I would like to add the requirement that it should be easily parseable for tools like Elastic Stack or Graylog. Because of this I do not like Option 2 very much, as the timestamp is split, the number of whitespaces differs and the pipe as divider is just overhead. It is not unparseable but at least makes it more difficult and the message size is increased.
The others are better, still not sure if I like the Loglevel column being the same size because INFO and WARN add an additional whitespace to have same length as ERROR. If it helps readability I am fine, as it is still parseable.

This one is tricky as it leads to shortening (e.g. ERRO) but does not address other log levels such as DEBUG (which shortened would be DEBU).

Is it the message portion of the log file that makes the parsing more difficult? Does detecting the ] prior to the message portion help? Would a more clear delineator before the message portion help?

It is totally fine with having different length like ERROR, DEBUG and INFO, adding then a different amount of whitespaces makes it a very likely cause of parser error because you have to realize that the format is not only one whitespace as separator but a flexible number of whitespaces.

Following up here, I appreciate the feedback so far. I chose to go with Option 1 and have opened some implementation PRs:

2 Likes

Log Format

Option 4?

Log directly to journalctl. With additional tagging for stage.

Hi, @Dirk pulled me into the circle with this one. :slight_smile:

He started to create Logstash rules for the new format and asked me about feedback. Running it through Logstash should be representative about how well the new format is parseable in general. Based on the short example in https://github.com/theforeman/kafo/pull/276 I can say, it works quite well with parsing the new logs with logmanagement tools.

Here’s some example code if you want to try it yourself. Filebeat is the agent to collect logs from a host, Logstash is a central processor who parses and forwards the logs. Please note that I just used example paths and placed all files into /tmp. Also the output goes into a file in JSON-format just for debugging. In a real life example you would put it into Elasticsearch and search and query it with Kibana.

Filebeat.yml

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /tmp/foreman.log
  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after
output.logstash:
  hosts: ["localhost:5044"]

And a matching Logstash pipeline:

input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => ["message","%{DATESTAMP:[@metadata][foremandate]} %{LOGLEVEL:[log][level]}%{SPACE}\[%{DATA:[foreman][stage]}\] %{GREEDYDATA:message}"]
    id => "foreman_log"
    tag_on_failure => ["_grokparsefailure","foreman_grok_failed"]
    overwrite => "message"
  }
  date {
    match => ["[@metadata][foremandate]", "yyyy-MM-dd HH:mm:ss"]
    id => "foreman_date"
    tag_on_failure => ["_dateparsefailure","foreman_date_failed"]
  }
}
output {
  file {
    path => "/tmp/foreman-output"
    codec => "json_lines"
  }
}

Dirk and me talked about providing this code as pre-configured Logstash pipeline you could just integrate into your Elastic Stack. This would make it quite easy to build upon it and e.g. use similar rules for other logs or parse the logs even more to get more detailed info from it. If that’s something you’d like to see, just let Dirk and me know. I’m afraid, I could miss the info because I’m not so often in this board (by now).

2 Likes