Elixir Self Discovering Cluster

Mar 10 2016

Distributed systems are by nature deployed on multiple machines. During development, though, it is far more convenient to set up such an environment on a single machine. In this post we demonstrate how to set up a dynamic cluster on a single development machine.

We'll use Elixir to create a node in a
cluster. Elixir is a dynamic, functional language designed for building
distributed, low-latency, fault-tolerant systems.

Docker will be used to deploy our nodes. Docker
containers wrap up a piece of software in a complete file-system that
contains everything it needs to run: code, runtime, system tools, system
libraries – anything you can install on a server. This guarantees that
it will always run the same, regardless of the environment it is running
in.

A Basic Node

A node will have only two tasks:
 - Seek other nodes.
 - Write to a log file.

Our cluster requires some kind of cross-machine node registry so that
nodes can discover one another. In this example we'll use the host's
local file system, but this can easily be implemented for production apps
using something like Amazon S3, PostgreSQL or any other storage.
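
Swapping the backend later is easier if the registry sits behind a small, uniform interface. Here is a minimal sketch of such an interface; the module and callback names are hypothetical, and the rest of this post simply calls the file-system functions directly to keep things short:

  defmodule ClusterRegistry do
    # Hypothetical behaviour: any storage backend (file system, S3,
    # PostgreSQL, ...) only needs to implement these two callbacks.
    @callback register(node_name :: String.t, timestamp :: String.t) :: :ok
    @callback list() :: [String.t]
  end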

At this stage, we'll assume that each node starts with an Erlang
long name of the form name@host_ip.

Every node will register itself in the cluster registry by creating a
file under its name. It will also write a timestamp so we'll know when
the node was last active.

  @sync_dir "/tmp/sync_dir/"

  def sign_as_active_node do
    File.mkdir_p @sync_dir
    {:ok, file} = File.open(path, [:write])
    IO.binwrite file, time_now_as_string
    File.close file
  end

  def path do
    @sync_dir <> Atom.to_string(Node.self)
  end
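
The snippet above relies on a time_now_as_string helper that isn't shown. One possible implementation, assuming we simply store the time as Unix seconds:

  # Assumed helper: the current time as a string (Unix seconds here).
  defp time_now_as_string do
    Integer.to_string(:os.system_time(:seconds))
  end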

In order to find other registered nodes, each node will try to ping the
others listed in that folder.

  def check_active_nodes do
    active_nodes
      |> Enum.map(&(String.to_atom &1))
      |> Enum.map(&({&1, Node.ping(&1) == :pong}))
  end

  def active_nodes do
    {:ok, active_members} = File.ls(@sync_dir)
    active_members
  end
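
Note that a successful Node.ping/1 also connects the two nodes, so after a round of pings Node.list/0 on any node will show the peers it has reached.

The timestamp we write on registration isn't used yet. One way to put it to work is to prune registry entries that haven't been refreshed for a while, so crashed nodes eventually disappear. A rough sketch, assuming timestamps are stored as Unix seconds (as in the helper above) and using a hypothetical @stale_after threshold:

  @stale_after 60  # seconds; an arbitrary cutoff for this sketch

  # Hypothetical addition: remove registry files whose timestamp is too old.
  def prune_stale_nodes do
    now = :os.system_time(:seconds)

    for name <- active_nodes do
      {:ok, stamp} = File.read(@sync_dir <> name)

      if now - String.to_integer(stamp) > @stale_after do
        File.rm(@sync_dir <> name)
      end
    end
  end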

We'll bundle it all into a simple recursive main loop that is started at
boot time.
And... that's it. We have a distributed system with a pulse check!

  def loop do
    sign_as_active_node
    status = inspect check_active_nodes
    Logger.info(Atom.to_string(Node.self) <> status)
    :timer.sleep(@interval)
    loop
  end
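
For completeness, here is one way to wire the pieces together so the loop starts at boot. This is a minimal sketch assuming a standard Mix application; the module name is hypothetical, and a real project would normally run the task under a supervisor:

  defmodule ClusterNode do
    use Application
    require Logger

    @sync_dir "/tmp/sync_dir/"
    @interval 1_000  # pulse interval in milliseconds; value chosen arbitrarily

    # Application callback: run the pulse loop in its own process at boot.
    def start(_type, _args) do
      Task.start_link(fn -> loop() end)
    end

    # sign_as_active_node/0, path/0, check_active_nodes/0, active_nodes/0
    # and loop/0 from the snippets above go here.
  end

Pointing mix.exs at this module (mod: {ClusterNode, []} in the application/0 function) is enough for elixir -S mix to boot the loop.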

This is a very basic setup in a mesh topology. We can, of course, expand
it to other, more advanced distribution architectures.

Let's see the nodes in action. We'll start a couple of nodes, each with a
unique long name and a predefined cookie.

  $> elixir --name n1@127.0.0.1 --cookie cookie -S mix
  $> elixir --name n2@127.0.0.1 --cookie cookie -S mix
  $> cat /tmp/log/cluster.log

  18:51:29.277 [info] [email protected]["[email protected]": true, "[email protected]": true]
  18:51:29.974 [info] [email protected]["[email protected]": true, "[email protected]": true]
  18:51:30.281 [info] [email protected]["[email protected]": true, "[email protected]": true]

"Dockering" the Node

Docker builds images automatically by reading the instructions from a
Dockerfile. Let's create one.

We'll use a clean Ubuntu 14.04 image.

FROM ubuntu:14.04

Install Elixir and add our app.

# Install Erlang
RUN apt-get update && apt-get install -y \
    wget \
    curl \
    unzip \
    libwxbase3.0 \
    libwxgtk2.8-0 \
    build-essential
RUN wget https://packages.erlang-solutions.com/erlang/esl-erlang/FLAVOUR_1_general/esl-erlang_18.2-1~ubuntu~precise_amd64.deb
RUN dpkg -i esl-erlang_18.2-1~ubuntu~precise_amd64.deb && apt-get install -f

# Install Elixir
RUN wget https://github.com/elixir-lang/elixir/releases/download/v1.2.1/Precompiled.zip
RUN mkdir -p /usr/local/elixir
RUN cd /usr/local/elixir && unzip /Precompiled.zip
# Make elixir and mix available on the PATH for the mix commands below
ENV PATH=/usr/local/elixir/bin:$PATH

# Install hex & rebar
RUN bash -c "mix local.hex <<< 'Y'"
RUN bash -c "mix local.rebar <<< 'Y'"

Add our app to the container

ADD . /app
WORKDIR /app

RUN mix deps.get
RUN mix compile

We want the app to start by default when the container boots

CMD ./run.sh

The run.sh script

#!/bin/bash
str=`date -Ins | md5sum`
name=${str:0:10}

elixir --name $name@127.0.0.1 --cookie cookie -S mix

Now we can run nodes in containers.

$> docker build -t spectory/iex_cluster .
$> docker run -it spectory/iex_cluster

08:12:56.821 [info]  [email protected]["[email protected]": true]
08:12:57.824 [info]  [email protected]["[email protected]": true]

Even though we can deploy multiple nodes this way, they won't form a cluster, because each container is isolated: every container has its own /tmp registry and its own network namespace.

Deploying a cluster

When deploying production apps to cloud services such as AWS or Google
Cloud, we typically create a network and run our system inside a subnet.
We want to simulate such a setup too and run all our nodes on the same
subnet.

Luckily, the good folks at Docker have supplied us with an easy way to do just that: Docker Compose.

Docker Compose is a tool for defining and running multi-container
Docker applications. With Compose, you use a single file to configure
all of the app’s services. Then using a single command, you create and
start all the services from your configuration.

Here is our docker-compose file:

version: '2'

services:
  node:
    build:
      context: ./
    volumes:
    - /tmp:/tmp

As we mentioned above, we use the host's file system as our node
registry. That's why we share the /tmp folder between all our nodes and
the host, using the volumes option.

That's it. Docker Compose will create a network, and each container that
boots will get an IP address on that network.

If you think that was easy, see how we deploy a cluster...

We need each node to register under its IP address, so we'll edit our run.sh:

#!/bin/bash
ip=`ip a | grep global | grep -oE '((1?[0-9][0-9]?|2[0-4][0-9]|25[0-5])\.){3}(1?[0-9][0-9]?|2[0-4][0-9]|25[0-5])'`
str=`date -Ins | md5sum`
name=${str:0:10}

elixir --name $name@$ip --cookie cookie -S mix

And now.....

$> docker-compose scale node=5
node_2 | 08:27:40.557 [info]  [email protected]["[email protected]": true, "[email protected]": true, "[email protected]": true, "[email protected]": true, "[email protected]": true]
node_3 | 08:27:40.567 [info]  [email protected]["[email protected]": true, "[email protected]": true, "[email protected]": true, "[email protected]": true, "[email protected]": true]
node_1 | 08:27:40.717 [info]  [email protected]["[email protected]": true, "[email protected]": true, "[email protected]": true, "[email protected]": true, "[email protected]72.19.0.3": true]
node_5 | 08:27:41.224 [info]  [email protected]["[email protected]": true, "[email protected]": true, "[email protected]": true, "[email protected]": true, "[email protected]": true]
node_4 | 08:27:41.382 [info]  [email protected]["[email protected]": true, "[email protected]": true, "[email protected]": true, "[email protected]": true, "[email protected]": true]

Presto... a self-discovering cluster.

Summary

We've created our node using Elixir. A single node can now run anything
developed in Elixir, or launch processes written in other languages.

We've wrapped our node in a Docker container, which makes it very
portable. We can easily deploy our nodes not only in our dev
environment but also to production; we just need to make sure the
containers can reach each other in any environment.

We've covered how to simulate a cross-machine cluster in a dev
environment using Docker Compose, which creates a network between the
containers. Compose lets us easily create, start, stop and scale our
cluster.

Happy clustering!

Guy Y.
Software Developer