What efficient pattern matching looks like at the bytecode level

While looking at a disassembled View module

defmodule EExTestWeb.UserView do
  use EExTestWeb, :view
end

in IntelliJ Elixir v7.3.0’s BEAM Chunk Code tab

Label 21 through label 27 of the disassembly

I thought something was wrong with the string inliner because all the bs_match_string calls seemed to be missing the prefix letter of the template names: 'orm.html' instead of 'form.html', 'ndex.html' instead of 'index.html', etc. Since each bs_match_string was directly after a label, I looked back up to find the jump that came to those labels: it’s the select_val call in label(22)’s block. The select_val’s value_to_label argument maps integers to labels. When I first saw this, I thought the numbers were some weird bitpacking of bitstring match states, but looking at an ASCII table

ASCII Table

Matching those to the labels, the first character of each template name is being used in the select_val!

Decimal	Char	Label	Suffix
102	f	23	orm.html
105	i	24	ndex.html
115	s	25	how.html
110	n	26	ew.html
101	e	27	dit.html

The labels that are jumped to from the select_val match then are checking that the rest of the string matches.

The select_val is being generated in label(28)’s block because there is a group of clauses for render_template generated in Phoenix.Template.__before_compile__

Phoenix.Template.before_compile

Phoenix.Template.compile

My theory is that this is how BEAM makes pattern matching efficient in general: it finds prefixes that can be matched against with select_val. As a corollary if you don’t see a select_val in your bytecode, then you’re not getting the most-efficient pattern matching branching as the pattern checks have to be implemented as separate bytecode instructions, and it might make sense to rearrange clauses that don’t affect semantics to see if a select_val pops out in generated bytecode. I’m not sure if the compiler has an optimization pass enough to prove clauses can be rearranged and get select_vals when they’re for clauses not immediately next to each other in the code as written.

Maps

If the bs_match_string doesn’t match the suffix of known template names, it keeps referring to label(28) as the fail_label. label(28) gives up on the arguments being a string and instead checks for map arguments.

label(28) and label(29) blocks

Looking at the Elixir in Phoenix.Template, x(1) is being used in because it’s the second argument in these two clauses which appear to match in order with label(28) and label(29)

Phoenix.Template render_template in quote block

The compiler is using the efficient is_tagged_tuple bytecode to check __MODULE__ in {__MODULE__, template}, instead of having to grab the value out with get_tuple_element and then an is_eq_exact test. is_tagged_tuple exists to handle the common Either type ({:ok, value} | {:error, reason}) and records. The is_eq_exact that does appear in label(28)‘s block is the check that the two template parameters on line 204 of the Elixir code have the same value. Since is_eq_exact only allows a left and right operand, this means that repeating a variable more than twice in a pattern probably requires additional bytecode operations to check, but I haven’t verified that yet.

label(29)‘s block shows that maps have no equivalent of is_tagged_tuple: to check a single map key’s value against a constant, you need to get the value into a register with get_map_elements and then test against the constant with is_eq_exact. I wonder if a fused operation is_tagged_map would make checking conn = %Plug.Conn{} and other struct guard patterns faster?

It also appears from the moves that the compiler doesn’t trace that x(2) is already proved to hold EExTestWeb.UserView, so that it can be used moved to x(0) for the Template.raise_template_not_found call. The code as shown starts with

x(0) = template
x(1) = assigns
x(2) = EExTestWeb.UserView

move(source: x(1), destination: x(2)) overwrite EExTestWeb.UserView

x(0) = template
x(1) = assigns
x(2) = assigns

move(source: x(0), destination: x(1)) overwrites the assigns

x(0) = template
x(1) = template
x(2) = assigns

move(source: EExTestWeb.UserView, destination: x(0)) reloads the EExTestWeb.UserView from the LitT literal table chunk

x(0) = EExTestWeb.UserView
x(1) = template
x(2) = assigns

What needed to happen was a 3 register rotation, but since there’s no bytecode operation for that, it’s done in 3 moves. Eliminating the re-read of LitT would require using 4th register (x(3)) and an extra move, so as long as 1 LitT move is cheaper than 2 register to register moves, this is more efficient.

Checking for efficient pattern matching

If you want to confirm the compiler is walking your pattern matches as efficiently as possible, disassemble the .beam files and check that the beginning of each function is using select_val. Repeated usage of the same variable in a pattern are first stored in separated registers before being pairwise checked for equality with is_eq_exact. Although the VM has hundreds of x registers, because all calls need the arguments starting at x(0), groups of moves will follow pattern matches to rearrange the pattern registers to their call argument positions.

Additional Resources

If you’d like to learn more about the BEAM format, I used BEAM Wisdom’s BEAM File Format to guide development of the .beam parser in IntelliJ Elixir, which powers the Decompiler and the BEAM Chunk viewer shown in the above screenshots. Opcodes and operation names were determined from genop.tab and searching the OTP source code for the operation names.

Maps

Checking for efficient pattern matching

Additional Resources

Newsletter

Stay in the Know