In part 1, we looked at the mechanics of how Stream.unfold/2
works. Don’t worry; we’ll be digging into other Stream
methods. But let’s take a break with a simple look at Erlang term ordering. Take a peek at this page and you will see this referenced in lots of places. Before we jump into code examples, I want to first evaluate some finer linguistic details.
Erlang Term Ordering
Term is a keyword I see throughout the Elixir community – Erlang term ordering, Erlang Term Storage (ETS), and Disk-Based Term Storage are a few examples. In Erlang, an expression like sub_exp1 + sub_exp2
are made up of two terms. A term can be an integer, float, atom, string, list, map, or tuple. Term comparison operators also include ==
, /=
, >
, and many more that you are probably familiar with.
Can we compare across data types? There is a pecking order.
number < atom < reference < fun < port < pid < tuple < list < bit-string
iex> {1, 2} > %{:a => 1}
false
We can scope this idea of Erlang term ordering to a few data types. That simplifies things. Except, there is no exclusive string data type in Erlang. I was surprised as well.
We are ready to look at some examples where term ordering comes into play. Here is a basic tuple sort.
iex> Enum.sort([{2, 1}, {1, 2}])
[{1, 2}, {2, 1}]
What about tuples of various lengths?
iex> coordinates = [{1, 1}, {0, 0, 0}, {1, 6}]
iex> Enum.sort(coordinates)
[{1, 1}, {1, 6}, {0, 0, 0}]
What if we need to reverse it? We can take advantage of Erlang term ordering explicity and use Enum.sort/2
.
iex> coordinates = [{1, 1}, {0, 0, 0}, {1, 6}]
iex> Enum.sort_by(coordinates, &(&1 > &2))
[{0, 0, 0}, {1, 6}, {1, 1}]
This leads us to one conclusion about tuple sizes.
iex> {0, 0, 0} > {0, 0}
true
What about maps?
iex> coordinates = [%{:b => 1}, %{:a => 0, :c => 0}]
iex> Enum.sort(coordinates)
[%{b: 1}, %{a: 0, c: 0}]
iex> coordinates = [%{:b => 1, :a => 0}, %{:a => 1}]
iex> Enum.sort(coordinates)
[%{a: 1}, %{a: 0, b: 1}]
This implies map sizes matter
iex> %{:a => 1, :b => 1} > %{:a => 1}
true
and key order matters
iex> %{:b => 1, :a => 0} > %{:a => 0, :b => 1}
true
And now a binary
iex> vowels = "aeiou"
iex> vowels? = "aeyou"
iex> vowels > vowels?
false
We can see a pattern emerging that can be solidified into various maxims on how Erlang term ordering works.
-
Lists are compared element by element.
-
Tuples are ordered by size, two tuples with the same size are compared element by element.
-
Maps are ordered by size; two maps with the same size are compared by keys in ascending term order and then by values in key order. In maps, key order integers types are considered less than floats types.
-
When comparing an integer to a float, the term with the lesser precision is converted into the type of the other term, unless the operator is one of =:= or =/=. A float is more precise than an integer until all significant figures of the float are to the left of the decimal point. The conversion strategy is changed depending on the size of the float, because otherwise, comparison of large floats and integers would lose their transitivity.
-
Bitstrings are compared byte by byte, incomplete bytes are compared bit by bit.
Let’s look at one more function before we move on to make sure this knowledge is solidified. min_max/2
is defined as returning a tuple with the minimal and the maximal elements in the enumerable according to Erlang’s term ordering.
Ordering based on ASCII value.
iex> coordinates = [:a, :z, :b]
iex> Enum.min_max(coordinates)
{:a, :z}
Ordering based on map size.
iex> coordinates = [%{:a => 1}, %{:a => 1, :b => 2, :c => 3}, %{:a => 1, :b => 2}]
iex> Enum.min_max(coordinates)
{%{a: 1}, %{a: 1, b: 2, c: 3}}
Our knowledge is cross functional. I hope this is getting easier.
We can also take a quick look at hairier examples with Date
and DateTime
. To our comparison operators above, Date
and DateTime
is just a map.
iex> early_january = ~D[2019-01-02]
iex> early_february = ~D[2019-02-01]
iex> early_february > early_january
false WAT?!?!
Well that is weird. However, as stated in the docs:
Remember, comparisons in Elixir using ==/2, >/2, </2 and friends are structural and based on the DateTime struct fields.
If we check i early_february
and i early_january
, it may make sense that comparison fails, since we learned above how maps will eventually compare each key by order.
# ~D[2019-01-02]
Raw representation
%Date{calendar: Calendar.ISO, day: 2, month: 1, year: 2019}
# ~D[2019-02-01]
Raw representation
%Date{calendar: Calendar.ISO, day: 1, month: 2, year: 2019}
Another interesting tidbit: In the Erlang VM, atoms can have a maximum of 255 characters. On the other hand, integers in Erlang/Elixir uses bignum arithmetic. As such, integers are only limited by the memory available on the system. Consequently, I can always give you a number higher or lower than what you stated the integer is: infinity + 1 > infinity
!
That’s It
Before I dived into writing this article, I had taken Erlang term ordering as a given and hopelessly applied it without evaluating what it really means. I hope some of the finer details help solidify your knowledge of Erlang terms
and how they can be used with comparison operators in Elixir.
In part 3, we will jump back to the Stream
module and look at the mechanics of how methods like Stream.map/2
create composable, lazy enumerables differently than in part 1.
DockYard is a digital product agency offering exceptional strategy, design, full stack engineering, web app development, custom software, Ember, Elixir, and Phoenix services, consulting, and training. With a nationwide staff, we’ve got consultants in key markets across the U.S., including Portland, Los Angeles, Salt Lake City, Minneapolis, Miami, Washington D.C., and Boston.