Elixir can call compiled code from languages like C or Rust through Native Implemented Functions (NIFs). This article is a brief introduction to how to connect to your Erlang VM instance with LLDB, as well as a brief cheat sheet of useful LLDB commands.
What is LLDB
LLDB is a debugger that allows a developer to step through compiled code during its execution, as well as explore stack traces whenever a system exception is reached. Its name is a play on words between Low-Level Virtual Machine (LLVM) and DB (which in this context means “debugger”). For those interested, the same process applies to the GNU Debugger (GDB), albeit with different commands, and there is a command mapping on LLDB’s website.
Pre-Requisites for Following This Guide
You’ll need to have Elixir and Erlang installed (which are most easily installed via asdf).
This guide used Elixir 1.15.4 and Erlang 26.0.2, but concepts should apply regardless of version.
For building the NIFs, make and g++ (or an equivalent C++ compiler) should be available. The installation for each of these
differs quite a bit depending on the platform and environment setup, so it may be helpful to look up instructions.
Creating and Understanding the Code
Before we dive into LLDB, let’s prepare our example code. The instructions below
will help us set up a Mix project that provides a natively implemented upcase/1
function. Along the way, we’ll also understand more about how NIFs are compiled
and mapped onto Elixir functions.
First, we’ll generate a standard Mix project with mix new nif_example.
Then, add the following files inside the project we just generated. Each
file will be followed by its purpose in the setup.
nif_example/Makefile:
# Environment variables passed via elixir_make
# ERTS_INCLUDE_DIR
# MIX_APP_PATH
CFLAGS= -fPIC -I$(ERTS_INCLUDE_DIR) -Wall -w
ifdef DEBUG
CFLAGS += -g
endif
LDFLAGS = -shared -flat_namespace -undefined suppress
priv/libnifexample.so: cache/objs/my_nif.o
	@mkdir -p priv
	$(CXX) $^ -o $@ $(LDFLAGS)
cache/objs/%.o: c_src/%.cc
	@mkdir -p cache/objs
	$(CXX) $(CFLAGS) -c $< -o $@
The Makefile is how we use make to build a project.
There is quite a bit to understanding the Makefile, which isn’t really relevant here.
For brevity, we can think of it as the build script that calls the C++ compiler and builds
priv/libnifexample.so, which is the shared-object (.so) library file from which Elixir loads
the NIF implementations.
nif_example/c_src/my_nif.h:
#pragma once
#include "erl_nif.h"
ERL_NIF_TERM upcase(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[]);
my_nif.h is the C++ header file, which declares the upcase function. In this case,
because there’s only one .cc file, we could omit it, but for completeness, it has been included.
nif_example/c_src/my_nif.cc:
#include "my_nif.h"
ERL_NIF_TERM upcase(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[]) {
  if (argc != 1) {
    return enif_make_tuple2(env, enif_make_atom(env, "error"), enif_make_atom(env, "invalid_arg_count"));
  }
  ErlNifBinary bin;
  if (!enif_inspect_binary(env, argv[0], &bin)) {
    return enif_make_tuple2(env, enif_make_atom(env, "error"), enif_make_atom(env, "invalid_argument"));
  }
  ErlNifBinary out_bin;
  // The call below is commented out to introduce an artificial
  // segmentation fault error source.
  // enif_alloc_binary(bin.size, &out_bin);
  for (int i = 0; i < bin.size; i++) {
    char c = bin.data[i];
    if (c >= 'a' && c <= 'z') {
      out_bin.data[i] = c - 'a' + 'A';
    } else {
      out_bin.data[i] = c;
    }
  }
  return enif_make_tuple2(env, enif_make_atom(env, "ok"), enif_make_binary(env, &out_bin));
}
static int load(ErlNifEnv* env, void** priv_data, ERL_NIF_TERM load_info) {
  return 0;
}
static ErlNifFunc nif_funcs[] = {
    {"upcase", 1, upcase}};
ERL_NIF_INIT(Elixir.NifExample, nif_funcs, &load, NULL, NULL, NULL);
my_nif.cc is the main native source file. It includes the upcase function and the declaration of the NIF functions the library provides.
This declaration is done through ERL_NIF_INIT, which ultimately maps an Elixir name and arity to the corresponding C++ function.
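To make the name-and-arity mapping more concrete, here is a minimal standalone sketch of the idea. It is not how the Erlang VM implements the lookup: Term, NifFn, FuncEntry, upcase_stub and find_nif are all hypothetical stand-ins so the snippet compiles without erl_nif.h. Only the shape of each table entry (name, arity, function pointer) mirrors the nif_funcs array above.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins; the real types come from erl_nif.h.
using Term = unsigned long;
using NifFn = Term (*)(int argc, const Term argv[]);

// Mirrors the shape of a nif_funcs entry: name, arity, native function.
struct FuncEntry {
  const char* name;  // function name as seen from Elixir
  unsigned arity;    // number of arguments
  NifFn fptr;        // native implementation
};

// Hypothetical native function that simply echoes its only argument.
static Term upcase_stub(int argc, const Term argv[]) {
  assert(argc == 1);  // the table below only registers upcase/1
  return argv[0];
}

static const FuncEntry funcs[] = {
    {"upcase", 1, upcase_stub}};

// Finds the native function registered under name/arity.
// Returns nullptr when no entry matches.
NifFn find_nif(const char* name, unsigned arity) {
  for (const FuncEntry& entry : funcs) {
    if (std::strcmp(entry.name, name) == 0 && entry.arity == arity) {
      return entry.fptr;
    }
  }
  return nullptr;
}
```

In this sketch, find_nif("upcase", 1) returns the stub, while find_nif("upcase", 2) returns nullptr, since name and arity together identify a function.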
As a brief introduction to the native side of NIFs, let’s understand the three arguments that all NIFs receive: env, argc and argv.
A more thorough explanation is found in the official docs.
ErlNifEnv* env is a pointer to a special data structure: the environment that acts as the communication pipeline between the native side and
the Erlang/Elixir side of the execution workflow.
const ERL_NIF_TERM argv[] is an array of ERL_NIF_TERM values, which is how the native side gets any value passed as an argument. Generally, these are
decoded by special functions, and all NIFs also return an ERL_NIF_TERM value. int argc is the length of this array, which must be passed separately
because C arrays do not carry their own length. This pair is analogous to the main(int argc, char **argv) usually present in C/C++.
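As a small illustration of why argc travels alongside argv, the sketch below uses a hypothetical helper, count_matches, which is not part of erl_nif.h. Inside the function, argv is only a pointer, so the caller has to say how many elements are valid.

```cpp
#include <cassert>

// Hypothetical helper: counts how many of the first `argc` values equal `needle`.
// `argv` decays to a plain pointer here, so the function cannot discover
// the array length on its own; `argc` has to carry it.
int count_matches(int argc, const long argv[], long needle) {
  assert(argc >= 0);  // a negative length would make no sense
  int count = 0;
  for (int i = 0; i < argc; i++) {
    if (argv[i] == needle) {
      count++;
    }
  }
  return count;
}
```

Passing a smaller argc than the real array length simply makes the function look at fewer elements, which is the same contract a NIF relies on when it checks argc before reading argv[0].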
nif_example/lib/nif_example.ex:
defmodule NifExample do
  @on_load :__on_load__
  def __on_load__ do
    # We refer to `:code.priv_dir` indirectly because at runtime, the `priv` dir
    # is not necessarily in the same path as `./priv` -- as an exercise,
    # try running it via `iex -S mix` and check the returned path!
    path = :filename.join(:code.priv_dir(:nif_example), ~c"libnifexample")
    :erlang.load_nif(path, 0)
  end
  def upcase(_binary), do: :erlang.nif_error(:undef)
end
The NifExample Elixir module is where we declare the NIF functions (or function, in this case), using the @on_load module callback
to call :erlang.load_nif and load the declarations. Stub implementations must be provided so that the Elixir compiler knows which functions
to connect to the NIF declarations.
Next, let’s add the following to the project’s mix.exs:
...
def project do
  [
    ...
    compilers: [:elixir_make | Mix.compilers()]
    ...
  ]
end
...
def deps do
  [
    {:elixir_make, "~> 0.8"}
  ]
end
...
Finally, we must compile with DEBUG=1 mix compile, or export DEBUG=1 before compiling, so that the Makefile includes the -g flag.
-g adds debug information to the produced .so file, which allows lldb to reference function names and file locations
for each symbol, ultimately enabling a more complete debugging experience.
Now that the project is ready, we can run NifExample.upcase("my string") in IEx to get our upcased string. However, the
error we introduced in the code will cause a segmentation fault, which crashes the VM,
leaving no room for us to debug in Elixir-land.
Enter LLDB
LLDB, when attached to a process, can capture code backtraces, which are similar to Elixir’s stack traces. This means that we can figure out at least where our code is going wrong in NIF land.
To attach LLDB to a BEAM instance, we first start our IEx shell and use System.pid to obtain the host OS PID for the BEAM instance. Let’s say it outputs 133742.
In a separate shell, we can then use sudo lldb --attach-pid 133742 to run LLDB and attach it to the BEAM process.
Notice that as soon as LLDB is done setting up, the BEAM process is frozen. This is
because the debugger puts the process on hold until we use the continue command, which resumes execution.
Now, if we again call NifExample.upcase("my string"), we’ll see that LLDB reacts to the crash,
with a dump like this one:
(lldb) continue
Process 72797 resuming
Process 72797 stopped
* thread #13, name = 'erts_sched_9', stop reason = EXC_BAD_ACCESS (code=1, address=0xb5d5b)
frame #0: 0x00000001005a2168 beam.smp`erts_build_proc_bin + 8
beam.smp`erts_build_proc_bin:
-> 0x1005a2168 <+8>: ldr x8, [x2, #0x10]
0x1005a216c <+12>: str x8, [x1, #0x8]
0x1005a2170 <+16>: ldr x8, [x0]
0x1005a2174 <+20>: str x8, [x1, #0x10]
Target 0: (beam.smp) stopped.
We can use bt to get a more complete backtrace, which, as seen below, points us to my_nif.cc:25:59.
(lldb) bt
* thread #13, name = 'erts_sched_9', stop reason = EXC_BAD_ACCESS (code=1, address=0xb5d5b)
* frame #0: 0x00000001005a2168 beam.smp`erts_build_proc_bin + 8
frame #1: 0x0000000100615efc beam.smp`enif_make_binary + 660
frame #2: 0x0000000103703e30 libnifexample.so`upcase(env=0x000000017049ad38, argc=1, argv=0x000000017049ae40) at my_nif.cc:25:59
frame #3: 0x000000010048fb3c beam.smp`beam_jit_call_nif(process*, void const*, unsigned long*, unsigned long (*)(enif_environment_t*, int, unsigned long*), erl_module_nif*) + 100
frame #4: 0x00000001015b0afc
This means the error happens on line 25, column 59: the call to enif_make_binary.
In this case, after some investigation and documentation diving, we can conclude that the issue is that
we’re operating on an unallocated binary. We can also conclude that more easily because we manually
introduced the error by commenting out the enif_alloc_binary call, as called out beforehand.
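In the NIF itself, the fix is to restore the commented-out enif_alloc_binary(bin.size, &out_bin) call before the loop, ideally checking its return value, since it returns false when allocation fails. The standalone sketch below illustrates the same allocate-before-write pattern without needing erl_nif.h. Binary, alloc_binary and upcase_sketch are hypothetical stand-ins modeled loosely on ErlNifBinary and enif_alloc_binary, not the real API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for ErlNifBinary, reduced to the two fields used here.
struct Binary {
  std::size_t size;
  unsigned char* data;
};

// Hypothetical stand-in for enif_alloc_binary: allocates `size` bytes and
// fills in both fields. Returns 1 on success and 0 on failure.
int alloc_binary(std::size_t size, Binary* bin) {
  // Allocate at least one byte so an empty input still gets a valid pointer.
  void* data = std::malloc(size > 0 ? size : 1);
  if (data == nullptr) {
    return 0;
  }
  bin->size = size;
  bin->data = static_cast<unsigned char*>(data);
  return 1;
}

// Same loop as upcase in my_nif.cc, but the output is allocated first,
// so out_bin.size and out_bin.data hold real values before any write.
std::string upcase_sketch(const std::string& input) {
  Binary out_bin;
  if (!alloc_binary(input.size(), &out_bin)) {
    return std::string();  // allocation failed; nothing to return
  }
  assert(out_bin.size == input.size());
  for (std::size_t i = 0; i < input.size(); i++) {
    char c = input[i];
    if (c >= 'a' && c <= 'z') {
      out_bin.data[i] = static_cast<unsigned char>(c - 'a' + 'A');
    } else {
      out_bin.data[i] = static_cast<unsigned char>(c);
    }
  }
  std::string result(reinterpret_cast<char*>(out_bin.data), out_bin.size);
  std::free(out_bin.data);  // the sketch owns this buffer, so it frees it
  return result;
}
```

The manual free is specific to this sketch. In the real NIF, ownership of the buffer passes to the VM once enif_make_binary is called on it.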
Breakpoint Debugging with LLDB
For the sake of an example, let’s say we couldn’t find the error. First, let’s restart everything.
We can use exit or CTRL-D to exit the LLDB shell at any time. We can then repeat the lldb --attach-pid command
with the new PID. Another option is to use detach in the LLDB shell, restart the BEAM, get the new PID, and run attach <pid>, again in the LLDB shell.
Now, before running the Elixir code again, let’s set the breakpoint in LLDB with b my_nif.cc:25.
We can use frame variable to get the available variables and their values, and either continue to run until
the next breakpoint, or n/s to step through the code: n steps over function calls, while s steps into them.
Running help inside the lldb shell is a great way of finding what each command does
and discovering new commands. help <command> explains each one in more detail.
With frame variable, we get the output below:
(ErlNifEnv *) env = 0x000000016d3d2d38
(int) argc = 1
(const ERL_NIF_TERM *) argv = 0x000000016d3d2e40
(ErlNifBinary) bin = {
size = 4
data = 0x0000000140b34020 "ASDF"
ref_bin = 0x0000000000000000
__spare__ = ([0] = 0x00000001030e4a08, [1] = 0x0000000140ca0378)
}
(ErlNifBinary) out_bin = {
size = 6127693120
data = 0x0000000140d60b30 "ASDF"
ref_bin = 0x00000000000b5d4b
__spare__ = ([0] = 0x000000010750c462, [1] = 0x0000000000000002)
}
This output shows us that out_bin, although its data looks right, has a nonsensical size.
This is because out_bin was never initialized: enif_alloc_binary is the call that allocates the buffer and sets these fields.
We can detach from the process, fix the code, and then attach to a new BEAM instance. LLDB will still have the breakpoint, so we can just
run as normal, and get the new frame variable output with the corrected out_bin variable:
(lldb) frame variable
(ErlNifEnv *) env = 0x000000016dbe6d38
(int) argc = 1
(const ERL_NIF_TERM *) argv = 0x000000016dbe6e40
(ErlNifBinary) bin = {
size = 4
data = 0x0000000106c655d0 "ASDF\U00000001"
ref_bin = 0x0000000000000000
__spare__ = ([0] = 0x0000000102af8a08, [1] = 0x0000000130cc0668)
}
(ErlNifBinary) out_bin = {
size = 4
data = 0x0000000106c65610 "ASDF"
ref_bin = 0x0000000106c655f8
__spare__ = ([0] = 0x0000000105c9b922, [1] = 0x0000000000000002)
}
Another useful command is p <expr>, which evaluates simple C/C++ expressions in the current context. It is also helpful
for inspecting nested values, such as p out_bin.size or p bin.data in the code at hand:
(lldb) p bin.data
(unsigned char *) 0x0000000106c655d0 "ASDF\U00000001"
(lldb) p out_bin.size
(size_t) 4
The fixed code now behaves as expected:
iex(1)> NifExample.upcase("asdf")
{:ok, "ASDF"}
iex(2)> NifExample.upcase("ASDF")
{:ok, "ASDF"}
iex(3)> NifExample.upcase(1)
{:error, :invalid_argument}