Choosing the Right In-Memory Storage Solution (Part 1)

Most of the time when you store application data, it will be persisted to disk with a database such as Postgres. However, there are many cases where keeping some or all of the data in memory instead can achieve significant performance gains. In this series, we’ll take a look at some of the methods we have available for in-memory storage with Elixir, as well as which use cases are appropriate for each.

The Solutions

:ets (async)

:ets is a robust in-memory term store that comes built in with OTP. For this implementation, we use a table with :public access, so any process can read from or write to it

:ets.new(:my_table, [:set, :named_table, :public])

and read/write directly

:ets.lookup(:my_table, key)
:ets.insert(:my_table, {key, value})
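
As a rough illustration of what these calls return: on a :set table, :ets.lookup/2 gives back a list holding zero or one tuple, so callers typically pattern match on the result. The table name, keys, and the fetch helper below are made up for this sketch and are not part of the original implementation.

```elixir
# Hypothetical table name and keys, for illustration only.
# The table is left unnamed here, so we keep the returned reference.
table = :ets.new(:example_table, [:set, :public])

# insert/2 returns true; in a :set table it overwrites any existing entry for the key.
true = :ets.insert(table, {:user_1, "Ada"})

# lookup/2 returns a list of matching tuples (empty when the key is absent).
fetch = fn key ->
  case :ets.lookup(table, key) do
    [{^key, value}] -> {:ok, value}
    [] -> :error
  end
end

fetch.(:user_1)   # {:ok, "Ada"}
fetch.(:missing)  # :error
```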

:ets (serialized)

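Another option is to use an :ets table with :protected or :private access. A :private table limits both reads and writes to a single “owner” process (usually a GenServer), while a :protected table (the default) lets any process read but only the owner write. The owner then acts as a middleman between the table and the caller process to ensure serializability for writes and/or reads. The example here uses :private, so both reads and writes go through the GenServer.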

defmodule InMemory.ETS do
  use GenServer

  def start_link(_) do
    GenServer.start_link(__MODULE__, [], name: __MODULE__)
  end

  def serialized_read(key) do
    GenServer.call(__MODULE__, {:read, key})
  end

  def serialized_write(key, value) do
    GenServer.cast(__MODULE__, {:write, key, value})
  end

  def init(_) do
    table = :ets.new(:serialized_table, [:set, :private])
    {:ok, table}
  end

  def handle_call({:read, key}, _from, table) do
    {:reply, :ets.lookup(table, key), table}
  end

  def handle_cast({:write, key, value}, table) do
    :ets.insert(table, {key, value})
    {:noreply, table}
  end
end
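
For comparison, here is a minimal sketch of the :protected variant, where only writes are serialized through the owner and reads go straight to the table. The module name InMemory.ProtectedETS, the table name, and the choice of a synchronous call for writes are assumptions made for this example.

```elixir
defmodule InMemory.ProtectedETS do
  # Hypothetical module: a :protected table owned by this GenServer.
  use GenServer

  @table :protected_table

  def start_link(_opts \\ []) do
    GenServer.start_link(__MODULE__, [], name: __MODULE__)
  end

  # Reads bypass the GenServer: any process may read a :protected table.
  def read(key) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> {:ok, value}
      [] -> :error
    end
  end

  # Writes go through the owner; call/2 returns only after the write is applied.
  def write(key, value) do
    GenServer.call(__MODULE__, {:write, key, value})
  end

  @impl true
  def init(_) do
    # :named_table lets readers refer to the table by name instead of by reference.
    :ets.new(@table, [:set, :named_table, :protected, read_concurrency: true])
    {:ok, nil}
  end

  @impl true
  def handle_call({:write, key, value}, _from, state) do
    :ets.insert(@table, {key, value})
    {:reply, :ok, state}
  end
end
```

Using a call rather than a cast for writes means a caller that writes and then immediately reads will see its own value, at the cost of waiting for the owner to process the message.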

GenServer

The simplest way of storing data in-memory is to organize it as a Map, and store that Map as the state of a GenServer. We can just implement a basic interface such as get + put and be ready to go!

defmodule InMemory.MyGenServer do
  use GenServer

  def start_link(args \\ []) do
    GenServer.start_link(__MODULE__, args, name: __MODULE__)
  end

  def get(key), do: GenServer.call(__MODULE__, {:get, key})

  def put(key, value), do: GenServer.cast(__MODULE__, {:put, key, value})

  def init(_) do
    {:ok, %{}}
  end

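  def handle_call({:get, key}, _from, state) do
    # Map.get/2 returns nil for a missing key; Map.fetch!/2 would raise,
    # crashing the GenServer and losing the stored state.
    {:reply, Map.get(state, key), state}
  end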

  def handle_cast({:put, key, value}, state) do
    {:noreply, Map.put(state, key, value)}
  end
end

Note: Agent is also an option here if you’d prefer its provided interface to writing GenServer callbacks.
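
A minimal sketch of what the Agent version might look like, assuming the same get + put interface as above. The module name InMemory.MyAgent is hypothetical.

```elixir
defmodule InMemory.MyAgent do
  # Hypothetical Agent-backed equivalent of the GenServer above.
  use Agent

  def start_link(_opts \\ []) do
    # The Agent's state is a plain Map, starting out empty.
    Agent.start_link(fn -> %{} end, name: __MODULE__)
  end

  # Runs Map.get/2 inside the Agent process and returns only the value.
  def get(key), do: Agent.get(__MODULE__, &Map.get(&1, key))

  # Agent.update/2 is synchronous; Agent.cast/2 would be the fire-and-forget variant.
  def put(key, value), do: Agent.update(__MODULE__, &Map.put(&1, key, value))
end
```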

Module

For the most complex approach, we can use metaprogramming to create an Elixir Module dynamically which contains a Map of data and a single function for lookups.

def create_dynamic_module(data \\ %{}) do
  ast =
    quote do
      defmodule MyDynamicModule do
        def state, do: unquote(Macro.escape(data))

        def state(id), do: Map.get(unquote(Macro.escape(data)), id)
      end
    end

  [{MyDynamicModule, _}] = Code.compile_quoted(ast, "nofile")
  {:module, MyDynamicModule} = Code.ensure_loaded(MyDynamicModule)
end

Reads are quick and simple:

value = MyDynamicModule.state(key)

Note: You might wonder whether this approach could be improved by storing each row as its own function in the Module. Such an implementation is possible, but it turns out to be less performant: reads are so fast that the time spent constructing the function name on every lookup is non-negligible by comparison. Using the unaltered primary key as the function name is often not possible either, due to Elixir function naming limitations. The single-function Module approach (shown above) is therefore best in almost all cases.

Making updates with this approach requires fetching the full state, performing the update, deleting the dynamic module and re-creating it with the updated data:

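def put(key, value) do
  # Fetch the full state from the current module and apply the change.
  data = MyDynamicModule.state()
  updated = Map.put(data, key, value)

  # delete/1 marks the current code as old; purge/1 then removes the old code.
  :code.delete(MyDynamicModule)
  :code.purge(MyDynamicModule)

  # Recompile the module with the updated data baked in.
  create_dynamic_module(updated)
end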

:persistent_term

:persistent_term is a relatively new feature built into OTP (available since OTP 21.2). It was introduced around the same time the aforementioned dynamic module approach was gaining popularity in the community, so the two approaches are likely similar under the hood. However, :persistent_term can offer additional guarantees thanks to its lower-level implementation, which makes use of internal BEAM features.

According to the documentation, :persistent_term is

“similar to ets in that it provides a storage for Erlang terms that can be accessed in constant time, but with the difference that persistent_term has been highly optimized for reading terms at the expense of writing and updating terms”

and

“suitable for storing Erlang terms that are frequently accessed but never or infrequently updated”

We’ll put our data into a Map and store that map as a persistent term.

:persistent_term.put(:my_data, data)

Reads use a simple get/1 call, while updates are slightly more complex:

# Read
:persistent_term.get(:my_data)[key]

# Update
def put(key, value) do
  data = :persistent_term.get(:my_data)
  updated = Map.put(data, key, value)
  :persistent_term.put(:my_data, updated)
end
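
The read and update calls above could be wrapped in a small module along these lines. This is only a sketch: the module name InMemory.PersistentStore, the tuple key, and the use of :persistent_term.get/2 with an empty-map default are assumptions for the example.

```elixir
defmodule InMemory.PersistentStore do
  # Hypothetical wrapper; a tuple key namespaced by the module avoids
  # clashing with other persistent terms in the same node.
  @key {__MODULE__, :data}

  # Stores the initial Map as a single persistent term.
  def init(data \\ %{}), do: :persistent_term.put(@key, data)

  # get/2 falls back to an empty Map if init/1 has not been called yet.
  def get(key), do: Map.get(:persistent_term.get(@key, %{}), key)

  # Read-modify-write of the whole Map. This is not atomic, so
  # concurrent writers could overwrite each other's updates.
  def put(key, value) do
    updated = Map.put(:persistent_term.get(@key, %{}), key, value)
    :persistent_term.put(@key, updated)
  end
end
```

Keep in mind that, per the OTP documentation, replacing or erasing a persistent term triggers a global garbage collection pass over all processes, which is why this approach suits data that changes rarely.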

Read on to Part 2 of this series, where we’ll benchmark and compare read times for all of the above solutions!