My mental model of Postgres’s transactions was that it used MVCC (Multiversion Concurrency Control): it kept around multiple copies of rows, and each transaction automatically tracked which version it used on its first read or write. When it was time to COMMIT the transaction, the server checked whether that starting version number still matched the row’s current, outside-of-transaction version number. If it did for all modified rows, the transaction succeeded; if not, it rolled back. There was no need for user-supplied locks, and the server internally didn’t use locks either.
Then, while working on the indexer for POA Explorer, we started to get DBConnection.ConnectionError during our batch inserts:
[error] Task #PID<0.7577.0> started from #PID<0.7568.0> terminating
** (stop) exited in: GenServer.call(Explorer.Indexer.BlockImporter, {:import, %{blocks: [...
[error] Postgrex.Protocol (#PID<0.449.0>) disconnected:
** (DBConnection.ConnectionError) client #PID<0.7547.0> exited: exited in:
GenServer.call(Explorer.Indexer.BlockImporter, {:import, %{blocks: [...
Checking the log for Postgres (/usr/local/var/log/postgres.log for PostgreSQL >= 9.6 installed with Homebrew), we saw:
ERROR: deadlock detected
DETAIL: Process 2523 waits for ShareLock on transaction 1080737; blocked by process 2527.
Process 2527 waits for ShareLock on transaction 1080736; blocked by process 2523.
Process 2523:
INSERT INTO "addresses" AS a0 ("hash","inserted_at","updated_at") VALUES ($1,$2,$3),($4,$5,$6),...
Process 2527:
INSERT INTO "addresses" ("balance_fetched_at","fetched_balance","hash","inserted_at","updated_at")
VALUES ($1,$2,$3,$4,$5),($6,$7,$8,$9,$10),...
HINT: See server log for query details.
CONTEXT: while inserting index tuple (9697,48) in relation "addresses"
deadlock? ShareLock?! Why are there locks in MVCC!?
When we first hit these deadlocks, we tried to serialize the writes through a GenServer, because we had multiple Tasks doing batch inserts. This cleared them up, and we weren’t too worried about impacting the overall indexing time: with the batch inserts we weren’t bound by insert speed, but by the speed at which we could pull data from Parity.
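The real Explorer.Indexer.BlockImporter does more than this, but the serialization idea is simply that every batch insert goes through a single process, so only one insert runs at a time. A minimal, hypothetical sketch of that pattern (the module name, the import_batch/1 API, and the insert options here are assumptions for illustration, not the actual POA Explorer code):

defmodule Indexer.SerializedImporter do
  @moduledoc "Hypothetical sketch: funnel all batch inserts through one process."
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, Keyword.put_new(opts, :name, __MODULE__))
  end

  # Indexer Tasks call this and block until their batch has been written.
  def import_batch(changes_list) when is_list(changes_list) do
    GenServer.call(__MODULE__, {:import, changes_list}, :infinity)
  end

  @impl true
  def init(:ok), do: {:ok, nil}

  @impl true
  def handle_call({:import, changes_list}, _from, state) do
    # A GenServer handles one call at a time, so two batches can never take
    # their row locks concurrently and conflicting lock orders cannot arise.
    result =
      Explorer.Repo.insert_all("addresses", changes_list,
        on_conflict: :nothing,
        conflict_target: :hash
      )

    {:reply, result, state}
  end
end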
Then we made another set of Tasks that also needed to write to the addresses table, so we could fetch the current balance asynchronously as soon as we first saw an address on-chain. This caused the deadlocks to occur again. While part of the team went to move these tasks’ writes into the same serialization as the main indexer’s writes, I kept googling, trying to understand why Postgres, whose MVCC is always contrasted with locking databases like MySQL and MSSQL, was talking about locks at all. Eventually, after reading a few blog posts, I was able to find an explanation of ShareLocks.
ShareLocks are row locks that are taken on each row as it is written by the transaction. ShareLock deadlocks occur when two separate transactions try to write the same rows in opposite orders. And then it clicked: it’s the classic mutex example of AB vs. BA locking. I thought to myself, “I use transactions and Elixir, so I don’t have to deal with this low-level locking complexity”, but that didn’t mean I didn’t know how to fix it. I’ve had to deal with mutexes in C, Java, and Ruby before.
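To make the AB vs. BA pattern concrete, here is a minimal sketch of the failure mode (the placeholder hashes, simplified columns, and on_conflict options are mine, not the real indexer code): two concurrent multi-row inserts list the same addresses in opposite orders, so each statement grabs its locks in the opposite order from the other.

# Two batches containing the same rows, in opposite orders (placeholder hashes).
batch_ab = [%{hash: "0x0a"}, %{hash: "0x0b"}]
batch_ba = Enum.reverse(batch_ab)

# Locks are taken row by row as each row is written, in the order the rows
# appear in the VALUES list. If the two statements interleave, each ends up
# waiting on the other: a deadlock, and Postgres aborts one of them.
tasks =
  for batch <- [batch_ab, batch_ba] do
    Task.async(fn ->
      Explorer.Repo.insert_all("addresses", batch,
        on_conflict: :nothing,
        conflict_target: :hash
      )
    end)
  end

Enum.map(tasks, &Task.await(&1, :infinity))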
Fixing the ShareLock deadlock required sorting the inserts, so that the implicit ShareLocks taken as each row in the batch was written would always be acquired in the same order. For addresses, I used hash because that is the unique field for the table. It doesn’t matter that hash doesn’t make sense as an ordering to humans; it just matters that it is consistent.
defp sort_address_changes_list(changes_list) do
  Enum.sort_by(changes_list, & &1.hash)
end
– From Explorer.Chain.sort_address_changes_list/1
To ensure the two sets of Tasks used this consistent ordering, sort_address_changes_list/1 was called before the Ecto.Repo.insert_all/3 calls.
defp insert_addresses(changes_list, named_arguments) when is_list(changes_list) and is_list(named_arguments) do
  timestamps = Keyword.fetch!(named_arguments, :timestamps)
  timeout = Keyword.fetch!(named_arguments, :timeout)

  # order so that row ShareLocks are grabbed in a consistent order
  ordered_changes_list = sort_address_changes_list(changes_list)

  insert_changes_list(
    ordered_changes_list,
    conflict_target: :hash,
    on_conflict: [set: [balance_fetched_at: nil]],
    for: Address,
    timeout: timeout,
    timestamps: timestamps
  )

  {:ok, for(changes <- ordered_changes_list, do: changes.hash)}
end
– From Explorer.Chain.insert_addresses/2
def update_balances(balances, options \\ []) when is_list(options) do
  timestamps = timestamps()

  changes_list =
    for {hash_string, amount} <- balances do
      {:ok, truncated_hash} = Explorer.Chain.Hash.Truncated.cast(hash_string)
      {:ok, wei} = Wei.cast(amount)

      Map.merge(timestamps, %{
        hash: truncated_hash,
        fetched_balance: wei,
        balance_fetched_at: timestamps.updated_at
      })
    end

  # order so that row ShareLocks are grabbed in a consistent order.
  # MUST match order used in `insert_addresses/2`
  ordered_changes_list = sort_address_changes_list(changes_list)

  {_, _} =
    Repo.safe_insert_all(
      Address,
      ordered_changes_list,
      conflict_target: :hash,
      on_conflict: :replace_all,
      timeout: Keyword.get(options, :timeout, @insert_addresses_timeout)
    )

  :ok
end
– From Explorer.Chain.update_balances/2
Importantly, because (1) this bug was difficult to figure out and (2) the code to do the sorting isn’t obviously necessary, I made sure to leave a comment in each place it is used. Don’t be afraid of commenting your code when it is necessary to explain the non-obvious or to prevent bug regressions.
While reading the PostgreSQL Explicit Locking documentation, I figured out that when a deadlock occurred, Postgres wasn’t killing both transactions, only one, allowing the other to proceed:
Transaction one attempts to acquire a row-level lock on the specified row, but it cannot: transaction two already holds such a lock. So it waits for transaction two to complete. Thus, transaction one is blocked on transaction two, and transaction two is blocked on transaction one: a deadlock condition. PostgreSQL will detect this situation and abort one of the transactions.
So, as an alternative strategy, we could have rescued the DBConnection.ConnectionError and simply retried the aborted insert; a sketch of that approach is below. It wouldn’t have been a great strategy, though:

- DBConnection.ConnectionError is non-specific, so a retry would also fire for failures that have nothing to do with deadlocks.
- The batch inserts can take significant time, so one Task could stall out as it is continually blocked and retries.
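For completeness, the retry approach might have looked something like this sketch, assuming the aborted insert surfaces as a raised DBConnection.ConnectionError in the calling process (the function name, retry count, and back-off are assumptions; this is not code we shipped):

# Hypothetical: retry a batch insert when the server aborts it, e.g. after a deadlock.
defp retry_on_connection_error(insert_fun, retries_left \\ 3) when is_function(insert_fun, 0) do
  insert_fun.()
rescue
  error in DBConnection.ConnectionError ->
    if retries_left > 0 do
      # Back off briefly, then retry the whole batch; the transaction that won
      # the deadlock should have committed by then.
      Process.sleep(100)
      retry_on_connection_error(insert_fun, retries_left - 1)
    else
      reraise error, __STACKTRACE__
    end
end

A caller would wrap each batch, e.g. retry_on_connection_error(fn -> insert_addresses(changes_list, named_arguments) end), but the two problems above would remain.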
By using a consistent sorting for our batch inserts across all write paths for the same table, we were able to avoid deadlocks and not have to retry transactions.