While looking at a disassembled View module
defmodule EExTestWeb.UserView do
use EExTestWeb, :view
end
in IntelliJ Elixir v7.3.0’s BEAM Chunk Code tab
I thought something was wrong with the string inliner because all the bs_match_string
calls seemed to be missing the prefix letter of the template names: 'orm.html'
instead of 'form.html'
, 'ndex.html'
instead of 'index.html'
, etc. Since each bs_match_string
was directly after a label, I looked back up to find the jump that came to those labels: it’s the select_val
call in label(22)
’s block. The select_val
’s value_to_label
argument maps integers to labels. When I first saw this, I thought the numbers were some weird bitpacking of bitstring match states, but looking at an ASCII table
Matching those to the labels, the first character of each template name is being used in the select_val
!
Decimal | Char | Label | Suffix |
---|---|---|---|
102 | f | 23 | orm.html |
105 | i | 24 | ndex.html |
115 | s | 25 | how.html |
110 | n | 26 | ew.html |
101 | e | 27 | dit.html |
The labels that are jumped to from the select_val
match then are checking that the rest of the string matches.
The select_val
is being generated in label(28)
’s block because there is a group of clauses for render_template
generated in Phoenix.Template.__before_compile__
My theory is that this is how BEAM makes pattern matching efficient in general: it finds prefixes that can be matched against with select_val
. As a corollary if you don’t see a select_val
in your bytecode, then you’re not getting the most-efficient pattern matching branching as the pattern checks have to be implemented as separate bytecode instructions, and it might make sense to rearrange clauses that don’t affect semantics to see if a select_val
pops out in generated bytecode. I’m not sure if the compiler has an optimization pass enough to prove clauses can be rearranged and get select_val
s when they’re for clauses not immediately next to each other in the code as written.
Maps
If the bs_match_string
doesn’t match the suffix of known template names, it keeps referring to label(28) as the fail_label
. label(28)
gives up on the arguments being a string and instead checks for map arguments.
Looking at the Elixir in Phoenix.Template
, x(1)
is being used in because it’s the second argument in these two clauses which appear to match in order with label(28)
and label(29)
The compiler is using the efficient is_tagged_tuple
bytecode to check __MODULE__
in {__MODULE__, template}
, instead of having to grab the value out with get_tuple_element
and then an is_eq_exact
test. is_tagged_tuple
exists to handle the common Either type ({:ok, value} | {:error, reason}
) and records. The is_eq_exact
that does appear in label(28)
‘s block is the check that the two template
parameters on line 204 of the Elixir code have the same value. Since is_eq_exact
only allows a left and right operand, this means that repeating a variable more than twice in a pattern probably requires additional bytecode operations to check, but I haven’t verified that yet.
label(29)
‘s block shows that maps have no equivalent of is_tagged_tuple
: to check a single map key’s value against a constant, you need to get the value into a register with get_map_elements
and then test against the constant with is_eq_exact
. I wonder if a fused operation is_tagged_map
would make checking conn = %Plug.Conn{}
and other struct guard patterns faster?
It also appears from the move
s that the compiler doesn’t trace that x(2)
is already proved to hold EExTestWeb.UserView
, so that it can be used moved to x(0)
for the Template.raise_template_not_found
call. The code as shown starts with
x(0) = template
x(1) = assigns
x(2) = EExTestWeb.UserView
move(source: x(1), destination: x(2))
overwrite EExTestWeb.UserView
x(0) = template
x(1) = assigns
x(2) = assigns
move(source: x(0), destination: x(1))
overwrites the assigns
x(0) = template
x(1) = template
x(2) = assigns
move(source: EExTestWeb.UserView, destination: x(0))
reloads the EExTestWeb.UserView
from the LitT
literal table chunk
x(0) = EExTestWeb.UserView
x(1) = template
x(2) = assigns
What needed to happen was a 3 register rotation, but since there’s no bytecode operation for that, it’s done in 3 moves. Eliminating the re-read of LitT
would require using 4th register (x(3)
) and an extra move
, so as long as 1 LitT
move
is cheaper than 2 register to register moves, this is more efficient.
Checking for efficient pattern matching
If you want to confirm the compiler is walking your pattern matches as efficiently as possible, disassemble the .beam
files and check that the beginning of each function is using select_val
. Repeated usage of the same variable in a pattern are first stored in separated registers before being pairwise checked for equality with is_eq_exact
. Although the VM has hundreds of x
registers, because all calls need the arguments starting at x(0)
, groups of move
s will follow pattern matches to rearrange the pattern registers to their call argument positions.
Additional Resources
If you’d like to learn more about the BEAM format, I used BEAM Wisdom’s BEAM File Format to guide development of the .beam
parser in IntelliJ Elixir, which powers the Decompiler and the BEAM Chunk viewer shown in the above screenshots. Opcodes and operation names were determined from genop.tab and searching the OTP source code for the operation names.