Orchestrator and workers in docker compose file do not perform remote execution tasks

Problem:

I have two problems with remote execution related to the setup of the orchestrator and workers as provided in your docker compose file.

  1. When running state.highstate via the Salt remote execution provider, the container “app” running the Foreman web GUI throws the error “The Dynflow world was not initialized yet. If your plugin uses it, make sure to call Rails.application.dynflow.require! in some initializer”. I can work around this by setting lazy_initialization = true in the Dynflow configuration (see the sketch below). In config/application.rb there is in fact a call to @dynflow.require!, but this does not seem to suffice. What is the best way to ensure that Dynflow is initialized properly?

  2. When I simply start the three containers app, orchestrator and worker (the latter two both running the script extras/dynflow-sidekiq.rb) as defined in the docker compose file, they never pick up the remote execution tasks from Redis. It does work, though, when I revert the docker compose file to the state before this commit by @aruzicka and @ohadlevy, drop the orchestrator container, and define the command for the worker container like this:

command: bundle exec rake dynflow:executor

I already found these explanations by @aruzicka but still cannot get it to work. Can anybody give me a hint as to why it works when I start the Dynflow executor the “old” way but not when I run an orchestrator and workers the “new” way?
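For reference, this is roughly what the workaround from point 1 looks like (a sketch only; it assumes the Dynflow Rails configuration exposes a lazy_initialization attribute, and the initializer file name is made up):

# config/initializers/dynflow_lazy.rb (hypothetical file name)
# Let Dynflow initialize itself on first use instead of failing
# when no explicit initialize! happened during boot.
Rails.application.dynflow.config.lazy_initialization = true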

Expected outcome:

Remote execution tasks are performed.

Foreman and Proxy versions:

Foreman 3.5-stable, smart-proxy 3.5-stable, smart_proxy_salt latest

Foreman and Proxy plugin versions:

foreman_remote_execution 8.1.2, foreman-tasks latest, foreman_salt latest

Distribution and version:

Docker container based on the quay.io/foreman image, with some extra packages and plugins installed

Other relevant data:

To be fair, I don’t really know. The last time I messed with Foreman and containers was back in 2019 or so.

The initialization is a two-step process. The first step is a call to ::Rails.application.dynflow.require!, the second is a call to ::Rails.application.dynflow.initialize!. The second step is called from a Puma worker hook; could you check whether it is actually getting called?
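Roughly, the flow looks like this (a simplified sketch, not the literal Foreman code; the exact location of the hook in the Puma config is an assumption):

# Step 1 happens during boot, e.g. from config/application.rb:
::Rails.application.dynflow.require!

# Step 2 is triggered from a hook in the Puma config. Note that
# on_worker_boot only fires when Puma runs in clustered mode,
# i.e. with at least one worker process (FOREMAN_PUMA_WORKERS >= 1
# in this setup):
on_worker_boot do
  ::Rails.application.dynflow.initialize!
end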

In the old way, all the moving bits live in a single process, and as such it is easier to get working. From what you’re describing, it sounds as if in the new way the parts cannot talk to each other, but why that happens I cannot say right now.

@aruzicka Actually, I had set FOREMAN_PUMA_WORKERS=0 because I had trouble getting debugging in RubyMine to work when running Puma in clustered mode inside the Docker container. That explains why dynflow.initialize! never got called. Thanks for this hint!

I also tried to simplify the setup by dropping the worker container and starting only the orchestrator container like this:

command: bundle exec sidekiq -r ./extras/dynflow-sidekiq.rb -c 1 -q dynflow_orchestrator,default,remote_execution

This also does not work as expected: I do see the jobs appearing in Redis (queue remote_execution), but they are never executed and stay in the pending state. If you have another idea what the issue may be, please comment. Otherwise I will stick with the workaround (the old way) for the time being.
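In case it helps anyone debugging the same thing: Sidekiq keeps each queue as a Redis list named queue:<name>, so the pending jobs can be inspected from the redis-tasks container roughly like this (container and queue names as in the compose file below):

# Number of jobs waiting in the remote_execution queue
docker compose exec redis-tasks redis-cli llen queue:remote_execution

# Peek at the first few pending job payloads
docker compose exec redis-tasks redis-cli lrange queue:remote_execution 0 4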

Thanks a lot and best regards

Jannis

Glad to hear you resolved at least part of the issue.

Could you please share the Dockerfile you used to build your images (assuming that’s the route you took to add the extra packages) so I can follow along?

Sure.

Dockerfile:

# Base container that is used for both building and running the app
FROM registry.fedoraproject.org/fedora-minimal:33 as base
ARG RUBY_VERSION="2.7"
ARG NODEJS_VERSION="12"
ENV FOREMAN_FQDN=foreman.example.com
ENV FOREMAN_DOMAIN=example.com

RUN \
  echo -e "[nodejs]\nname=nodejs\nstream=${NODEJS_VERSION}\nprofiles=\nstate=enabled\n" > /etc/dnf/modules.d/nodejs.module && \
  echo -e "[ruby]\nname=ruby\nstream=${RUBY_VERSION}\nprofiles=\nstate=enabled\n" > /etc/dnf/modules.d/ruby.module && \
  microdnf install -y postgresql-libs ruby{,gems} rubygem-{rake,bundler} npm nc hostname ruby-devel zlib-devel sqlite-devel findutils vi \
  # needed for VNC/SPICE websockets
  python2-numpy && \
  microdnf clean all

ARG HOME=/home/foreman
WORKDIR $HOME
RUN groupadd -r foreman -f -g 0 && \
    useradd -u 1001 -r -g foreman -d $HOME -s /sbin/nologin \
    -c "Foreman Application User" foreman && \
    chown -R 1001:0 $HOME && \
    chmod -R g=u ${HOME}

# Add a script to be executed every time the container starts.
COPY extras/containers/entrypoint.sh /usr/bin/
RUN chmod +x /usr/bin/entrypoint.sh
ENTRYPOINT ["entrypoint.sh"]

# Temp container that download gems/npms and compile assets etc
FROM base as builder
ENV RAILS_ENV=production
ENV FOREMAN_APIPIE_LANGS=en
ENV BUNDLER_SKIPPED_GROUPS="test development openid libvirt journald facter console"

RUN \
  microdnf install -y redhat-rpm-config git \
    gcc-c++ make bzip2 gettext tar \
    libxml2-devel libcurl-devel ruby-devel \
    postgresql-devel && \
  microdnf clean all

ENV DATABASE_URL=nulldb://nohost

ARG HOME=/home/foreman
USER 1001
WORKDIR $HOME
COPY --chown=1001:0 . ${HOME}/
# Adding missing gems, for tzdata see https://bugzilla.redhat.com/show_bug.cgi?id=1611117
RUN echo gem '"tzinfo-data"' > bundler.d/container.rb
RUN bundle install --without "${BUNDLER_SKIPPED_GROUPS}" \
    --binstubs --clean --path vendor --jobs=5 --retry=3 && \
  rm -rf vendor/ruby/*/cache/*.gem && \
  find vendor/ruby/*/gems -name "*.c" -delete && \
  find vendor/ruby/*/gems -name "*.o" -delete
RUN \
  make -C locale all-mo && \
  mv -v db/schema.rb.nulldb db/schema.rb && \
  bundle exec rake assets:clean assets:precompile apipie:cache:index

RUN npm install --python=python2.7 --verbose --no-optional && \
  ./node_modules/webpack/bin/webpack.js --config config/webpack.config.js && npm run analyze && \
# cleanups
  rm -rf public/webpack/stats.json ./node_modules vendor/ruby/*/cache vendor/ruby/*/gems/*/node_modules bundler.d/nulldb.rb db/schema.rb && \
  bundle install --without "${BUNDLER_SKIPPED_GROUPS}" assets

USER 0
RUN chgrp -R 0 ${HOME} && \
    chmod -R g=u ${HOME}

USER 1001

FROM base

ARG HOME=/home/foreman
ENV RAILS_ENV=production
ENV RAILS_SERVE_STATIC_FILES=true
ENV RAILS_LOG_TO_STDOUT=true

USER 1001
WORKDIR ${HOME}
COPY --chown=1001:0 . ${HOME}/
COPY --from=builder /usr/bin/entrypoint.sh /usr/bin/entrypoint.sh
COPY --from=builder --chown=1001:0 ${HOME}/.bundle/config ${HOME}/.bundle/config
COPY --from=builder --chown=1001:0 ${HOME}/Gemfile.lock ${HOME}/Gemfile.lock
COPY --from=builder --chown=1001:0 ${HOME}/vendor/ruby ${HOME}/vendor/ruby
COPY --from=builder --chown=1001:0 ${HOME}/public ${HOME}/public
RUN echo gem '"tzinfo-data"' > bundler.d/container.rb && rm -rf bundler.d/nulldb.rb bin/spring

RUN date -u > BUILD_TIME

# Start the main process.
CMD bundle exec bin/rails server

EXPOSE 3000/tcp
EXPOSE 5910-5930/tcp

docker-compose file:

version: '3.4'
services:
  db:
    networks:
      - foreman
    environment:
      - POSTGRES_USER=foreman
      - POSTGRES_PASSWORD=foreman
      - POSTGRES_DATABASE=foreman
      - PGDATA=/var/lib/postgresql/data/pgdata
    hostname: db.example.com
    image: postgres:14
    ports:
      - "5432:5432"
    restart: always
    healthcheck:
      test: ["CMD-SHELL", "nc -z 127.0.0.1 5432 || exit 1"]
      interval: 30s
      timeout: 30s
      retries: 3
    #command: ["postgres", "-c", "log_statement=all","-c","log_duration=on","-c","log_min_messages=INFO"]
    command: ["postgres", "-c","log_min_messages=INFO"]
    volumes:
      - db:/var/lib/postgresql/data

  app: &app_base
    #image: quay.io/foreman/foreman:latest
    image: janniswarnat/foreman:3.5-m1
    #command: bundle exec rdebug-ide --host 0.0.0.0 --port 1235 --dispatcher-port 26163 -- bin/rails server -b 0.0.0.0
    command: bundle exec bin/rails server -b 0.0.0.0
    build:
      context: .
    networks:
      - foreman
      - salt
    environment:
      - DATABASE_URL=postgres://foreman:foreman@db/foreman?pool=5
      - RAILS_MAX_THREADS=5
      - RAILS_ENV=production
      - FOREMAN_FQDN=foreman.example.com
      - FOREMAN_DOMAIN=example.com
      - FOREMAN_RAILS_CACHE_STORE_TYPE=redis
      - FOREMAN_RAILS_CACHE_STORE_URLS=redis://redis-cache:6379/0
      - DYNFLOW_REDIS_URL=redis://redis-tasks:6379/0
      - REDIS_PROVIDER=DYNFLOW_REDIS_URL
      - FOREMAN_LOGGING_LEVEL=debug
      - FOREMAN_LOGGING_PRODUCTION_TYPE=stdout
      - FOREMAN_LOGGING_PRODUCTION_LAYOUT=pattern #json
      - FOREMAN_PUMA_WORKERS=1
    hostname: foreman.example.com
    links:
      - db
      - redis-cache
      - redis-tasks
    ports:
      - "3000:3000"
      - "5910-5930:5910-5930"
      - "1235:1235"
    restart: always
    healthcheck:
      test: ["CMD-SHELL", "nc -z 127.0.0.1 3000 || exit 1"]
      interval: 5m
      start_period: 1m

  orchestrator:
    <<: *app_base
    networks:
      - foreman
      - salt
    command: bundle exec sidekiq -r ./extras/dynflow-sidekiq.rb -c 1 -q dynflow_orchestrator,default,remote_execution
    #command: bundle exec rake dynflow:executor
    hostname: orchestrator.example.com
    ports: []

#  worker:
#    <<: *app_base
#    networks:
#      - foreman
#      - salt
#    command: bundle exec sidekiq -r ./extras/dynflow-sidekiq.rb -c 15 -q default,remote_execution
#    #command: bundle exec rake dynflow:executor
#    hostname: worker.example.com
#    ports: []

  redis-cache:
    image: redis
    networks:
      - foreman
    ports:
      - "6380:6379"

  redis-tasks:
    image: redis
    networks:
      - foreman
    command: redis-server --appendonly yes
    volumes:
      - redis-persistent:/data
    ports:
      - "6379:6379"

volumes:
  db:
  redis-persistent:

networks:
  salt:
    external: true
  foreman:

Oh, yeah, thank you. We were using the wrong syntax for specifying multiple queues.

See:

It would be great if you could check the PR out and leave a comment there if it does the trick for you.
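For anyone landing here with the same symptom: as far as I can tell from the Sidekiq CLI, a comma inside a single -q flag is parsed as a queue,weight pair rather than as a list of queues, so listening on several queues means repeating the flag, roughly:

command: bundle exec sidekiq -r ./extras/dynflow-sidekiq.rb -c 1 -q dynflow_orchestrator -q default -q remote_execution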

This solves my issue :slight_smile:. Thanks a lot!