Gracefully shutting down worker nodes on Elastic Beanstalk
At work, we recently came across what should have been a straightforward problem that turned out to be anything but. We wanted to host a worker – a program that consumes jobs from a queue – on Amazon’s Elastic Beanstalk. We wanted to do this largely for legacy reasons: our service was already running on EB and we wanted to avoid the operational cost of moving to something else. Given the effort it took to get working, this may well have been the wrong move, but by doing so we did manage to delay the inevitable.
One of the nice things about Elastic Beanstalk is that it manages things like taking down (or upgrading, or restarting) a node so that it doesn’t impact any in-flight transactions. EB’s normal method of shutting down a node without interrupting a transaction is to stop sending network requests to it (re-routing them to another node) and then kill it once it hasn’t seen any activity for a while. Since we wanted to put a worker on the node, however, this strategy wouldn’t do: a worker doesn’t respond to incoming network requests, so it will keep working as long as it has jobs to complete. Instead, we needed a different way to tell the worker not to take on any more jobs.
The strategy to achieve this is very simple: send the worker a SIGTERM. The worker reacts to this signal by finishing its current jobs, refusing to take on new ones, and exiting when it’s done. As a backup, a SIGKILL can be sent to the worker after waiting a while, to ensure that it gets shut down even if some jobs are stuck. The standard EB application process is managed by Upstart, but the way that service is handled meant we couldn’t reuse it.¹ No problem, though: we figured we would just create a new Upstart service for our worker. After all, Upstart manages all the things we needed: the ability to send a SIGTERM to a process and then wait before killing it, as well as the ability to automatically restart a service up to a certain number of restarts within a given time period.
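Conceptually, the shutdown behaviour we want from the worker looks like the sketch below (our actual worker is a Node.js process; fetch_next_job and run_job are hypothetical placeholders, not code from our service):
#!/bin/sh
# Conceptual sketch only: on SIGTERM, finish the job in progress,
# take no new jobs, and exit cleanly.
shutting_down=0
trap 'shutting_down=1' TERM

while [ "$shutting_down" -eq 0 ]; do
    job=`fetch_next_job`   # hypothetical: pull the next job off the queue
    run_job "$job"         # hypothetical: the trap only runs once this returns
done

exit 0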
We found out the hard way that the version of Upstart installed on EB’s AMIs is quite old. So old that it doesn’t support any of those core features we needed.² The usual fallback for meeting those requirements, start-stop-daemon, isn’t available either. Instead we turned to trusty System V init scripts, because they can do whatever we want them to do. To be clear, though, you shouldn’t do this if you have access to a recent Upstart or start-stop-daemon. Just use those instead! And really, you shouldn’t be running worker nodes on EB in the first place!
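For reference, had a recent Upstart been available, a job configuration covering both requirements would look roughly like this (a sketch only; the description, path, and limits are made up):
# /etc/init/worker.conf — illustrative sketch
description "queue worker"

# restart on crash, at most 10 times within 300 seconds
respawn
respawn limit 10 300

# wait 60 seconds after SIGTERM before sending SIGKILL
kill timeout 60

exec /var/app/current/run-worker.sh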
The init script
Our strategy was twofold: create an init script that starts and stops the process the way we want, and create a separate script that acts as the manager of our application and restarts it in the case of a crash. The trickiest part of this was ensuring that the init script always has access to the PID of the worker, which we store in a PID file. The start function of our init script looks like this:
start() {
    echo -n "Starting $prog:"
    # Load the generated environment file (see below) before launching.
    . $env
    # Run the manager script as the "games" user; log stdout and stderr.
    daemon --user=games $exec >> $log 2>&1 &
    retval=$?
    echo
    # The manager script writes the worker's PID to $pidfile.
    pid=`cat $pidfile`
    [ $retval -eq 0 ] && touch $lockfile && echo "[Started $pid]"
    return $retval
}
I’ll address where env comes from in a bit, but for now know that it contains the configuration necessary to run the application. We can also see that we’re getting the PID of our process from pidfile, which is set by the script that manages the worker, exec. Other than that, the biggest surprise is probably the use of the games user. We chose games because we wanted a non-root user that we had complete control over, and by default games exists yet doesn’t have any other particular use. As it saves us from creating a user, games became the user that owns our worker.
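For context, the variables referenced in these functions are defined near the top of the init script. Ours looks roughly like the following (the pidfile and env paths match the scripts below; the other paths and timeouts are illustrative):
prog="worker"
exec="/opt/worker/run-worker.sh"    # the manager script described below
pidfile="/tmp/worker.pid"           # written by the manager script
lockfile="/var/lock/subsys/worker"
env="/tmp/env"                      # generated environment file (see below)
log="/var/log/worker.log"
default_kill_timeout=30
min_kill_timeout=5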
stop is where things start getting fun:
stop() {
    kill_timeout=$default_kill_timeout
    # Allow the grace period to be configured via WORKER_SHUTDOWN_DELAY.
    if [ -f $env ]; then
        . $env
        if [ -n "$WORKER_SHUTDOWN_DELAY" ] &&
           [ "$WORKER_SHUTDOWN_DELAY" -gt $min_kill_timeout ]; then
            kill_timeout="$WORKER_SHUTDOWN_DELAY"
        fi
    fi
    pid=`cat $pidfile`
    i="0"
    echo "Stopping $prog [$pid]:"
    # kill sends SIGTERM by default.
    kill $pid
    retval=$?
    # Wait up to $kill_timeout seconds for the worker to exit on its own.
    while [ $i -lt $kill_timeout ]; do
        if [ -d "/proc/$pid" ]; then
            sleep 1
            i=$[$i+1]
        else
            rm -f $lockfile
            [ $retval -eq 0 ] && echo "[Stopped]"
            return $retval
        fi
    done
    # Grace period expired: SIGKILL the worker's manager by name.
    echo "$prog did not shut down in the grace period, sending sigkill"
    pkill -9 $prog
    retval=$?
    rm -f $lockfile
    [ $retval -eq 0 ] && echo "[Stopped]"
    return $retval
}
Here we’re again using env, this time to get the WORKER_SHUTDOWN_DELAY variable, so that we can configure this script from EB’s console. After working out our kill_timeout (the time we wait before forcibly killing the worker), we send a SIGTERM to the process found in pidfile (SIGTERM is the default signal sent by kill). Then we wait for the process to finish, which we determine by checking for the existence of the /proc/$pid directory. Either the process finishes within the allotted time, or we reach the end of the loop and send a SIGKILL to the worker’s manager, prog. The rest of our init script is more or less stock, but if you want to see it, you can check out the complete example at the end.
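That stock remainder is the usual System V boilerplate: sourcing /etc/init.d/functions (which provides the daemon helper used above), the variable definitions, and a case statement dispatching on the first argument. A minimal sketch of the dispatcher might look like this:
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        start
        ;;
    status)
        # Report whether the worker is running, based on the PID file.
        if [ -f $pidfile ] && [ -d "/proc/`cat $pidfile`" ]; then
            echo "$prog is running [`cat $pidfile`]"
        else
            echo "$prog is stopped"
        fi
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|status}"
        exit 2
        ;;
esac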
The worker’s manager
The worker’s manager is responsible for restarting the worker when it crashes. However, we want to limit the number of restarts in a given time period, as it’s possible for a worker to get into a crash loop that, if sustained, would be unhelpful (and would send me a lot of alerts in the process). It looks like this:
#!/bin/sh
# Restart the worker when it crashes, but give up if it crashes more than
# $restartLimit times within $timeLimit seconds.
restartLimit=10
timeLimit=300 # 5 minutes
crashCount=0

cd /var/app/current/
while true; do
    node services/queue/worker.js &
    pid=$!
    # Expose the worker's PID so the init script can signal it directly.
    echo $pid > /tmp/worker.pid
    wait $pid
    retval=$?
    if [ $retval -eq 0 ]; then
        # Clean exit: the worker shut down gracefully.
        exit 0
    else
        # Crash
        crashTime=`date +%s`
        if [ -z "$firstCrashTime" ]; then
            firstCrashTime=$crashTime
        fi
        timeDiff="$[$crashTime - $firstCrashTime]"
        if [ $timeDiff -lt $timeLimit ]; then
            crashCount=$[$crashCount + 1]
            if [ $crashCount -gt $restartLimit ]; then
                echo "Too many crashes, worker stopped"
                exit $retval
            fi
        else
            crashCount=1
            firstCrashTime=$crashTime
        fi
        echo "Worker crashed, restarting"
    fi
done
This script starts our worker (node services/queue/worker.js &), stashes its PID in the PID file, then waits for it to exit. If it exits with a value of 0, it shut down nicely; otherwise we know there’s been a crash. In the crash case, we start the loop over again unless there have been too many crashes in a short period of time.
Generating our environment
We’ve seen an environment file get used a couple of times in the init script. This is the essential environment that contains both our EB configuration (i.e. the environment variables we pass into EB) and the path required to start our application. The way we generate it is less than elegant:
#!/bin/bash
# Work out where EB installed Node so the worker can find the node binary.
NODE_INSTALL_DIR=`/opt/elasticbeanstalk/bin/get-config container -k nodejs_install_dir`
NODE_VERSION=`/opt/elasticbeanstalk/bin/get-config optionsettings -n "aws:elasticbeanstalk:container:nodejs" -o "NodeVersion"`
export NODE_PATH="${NODE_INSTALL_DIR}/node-v${NODE_VERSION}-linux-x64"
export PATH="$NODE_PATH/bin:$PATH"

# Convert EB's JSON environment config into shell variable assignments.
ENVIRONMENT_CONFIG=`/opt/elasticbeanstalk/bin/get-config environment`
ENV_VARS=`echo ${ENVIRONMENT_CONFIG} | sed 's/"\([^"]*\)":"\([^"]*\)",*/\1="\2" /g;s/^{//;s/}$//'`
echo "export $ENV_VARS PATH=\"$PATH\"" > /tmp/env
As you can see, finding our PATH is pretty Node-specific, but the approach can be extended to other EB platforms. The really tricky bit comes from getting our EB config. We use EB’s get-config environment, which returns its result as JSON. The next line is a “simple”³ JSON-to-shell-variable converter, which takes JSON in the form {"var1":"val1","var2":"val2"...} and converts it to var1="val1" var2="val2". This is clearly not the most robust solution… After getting our path and config environment, we simply stick it all in a file (prepended with export).
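For illustration, the generated /tmp/env ends up looking something like this (the variable names, values, and Node version here are made up):
export QUEUE_NAME="jobs" WORKER_SHUTDOWN_DELAY="90" PATH="/opt/nodejs/node-v0.12.7-linux-x64/bin:/usr/local/bin:/usr/bin:/bin"
This is the file the init script sources with . $env before starting or stopping the worker.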
Starting and restarting
Now that we have a bunch of scripts that do the things we want, we just need to run them at the right time. To this end, we need scripts present in three different subdirectories of /opt/elasticbeanstalk/hooks: appdeploy, configdeploy, and restartappserver, each of which runs its contained scripts on the event it describes. In each of these directories we want a script that generates the environment (the script we just talked about) as well as one that invokes the init script. The latter takes the following form:
#!/bin/bash
/etc/init.d/worker stop
. /tmp/env
if [ "$NODE_TYPE" != "api" ]; then
    # Start the worker when not a pure API node
    /etc/init.d/worker start
fi
Putting it together
All of these scripts are placed in the files section of a .ebextensions config file, so that EB creates the requisite files on deployment. We use a bit of symlink trickery to avoid repeating the same script more than once in the config file, but otherwise it’s straightforward. The completed config file spawns workers that can be gracefully shut down, and that get restarted after a crash. One thing to take note of with this arrangement is that you still need some sort of server running alongside the worker as the primary EB application in order to pass EB’s health checks. This server could simply return 200 to every request, or it could actually monitor the health of the worker!
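To give a rough idea of the shape of that config file, here is a sketch (the file names, modes, and exact hook paths are assumptions, not our production config):
# .ebextensions/worker.config — illustrative sketch
files:
  "/etc/init.d/worker":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/bin/sh
      # ... the init script shown above ...
  "/opt/elasticbeanstalk/hooks/appdeploy/99_worker.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/bin/bash
      # ... the environment generation and start/stop hooks shown above ...

# Symlink the appdeploy hook into the other hook directories so the same
# script isn't repeated in the config file (files are written before
# commands run, so the link target already exists).
commands:
  01_link_configdeploy_hook:
    command: ln -sf /opt/elasticbeanstalk/hooks/appdeploy/99_worker.sh /opt/elasticbeanstalk/hooks/configdeploy/99_worker.sh
  02_link_restartappserver_hook:
    command: ln -sf /opt/elasticbeanstalk/hooks/appdeploy/99_worker.sh /opt/elasticbeanstalk/hooks/restartappserver/99_worker.sh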
If you have ideas for how things could have been done differently, or improved, let me know!
1. Our service needs different shutdown logic from EB’s application service: on stopping, the application service simply sends a SIGKILL to all processes owned by a particular user. For instance, for Node.js applications, EB creates a nodejs user that then starts the application. ↩
2. As of September 2015. ↩
3. It’s one line! ↩