Friday, December 3, 2021

Serialization Anomaly when using MVCC (The good, the bad & the ugly)

Intro

When working in-depth with databases, sooner or later you will encounter transaction isolation levels. If you are new to transaction isolation, check out the Postgres transaction isolation documentation, because that documentation team did an amazing job! In this post I'd like to go into more depth on one issue mentioned on that page: the serialization anomaly.

Real life example

A picture says more than a thousand words, so let's try to create a mental picture by talking about a real-life scenario. For this, assume we are creating an application that has to manage financial accounts.

Let's take as a starting point 3 accounts with the following balances:

  • Account A: €20
  • Account B: €30
  • Account C: €40

Now consider 2 transactions:

  • tx1: transfers money (€10) from account A to account B
  • tx2: transfers money (€10) from account A to account C

To perform these transactions, our database engine has to read the balance of the sender from the balances table, subtract the amount sent, and write the new balance. Subsequently, it needs to read the balance of the receiver, add the amount, and write the receiver's new balance.

When these transactions happen in a non-overlapping manner, performing these changes to your data is almost trivial. We can detail the steps as follows, with the first six statements belonging to transaction 1 and the last six to transaction 2:

begin - RA - WA - RB - WB - end - begin - RA - WA - RC - WC - end

So in detail:

  • begin: begins the first transaction
  • RA: reads the balance of account A: 20 euro
  • WA: writes the new balance of 10 euro for account A
  • RB: reads the balance of account B: 30 euro
  • WB: writes the new balance of 40 euro for account B
  • end: ends the first transaction
  • begin: begins the second transaction
  • RA: reads the balance of account A: 10 euro (since the previous change is committed)
  • WA: writes the new balance of 0 euro for account A
  • RC: reads the balance of account C: 40 euro
  • WC: writes the new balance of 50 euro for account C
  • end: ends the second transaction

These steps should feel intuitive and familiar, as this is the behavior we expect when performing financial transactions in real life.

The reason this is easy is that the concurrency in this example is always 1: a single transaction is in the system at any time, so no overlap is taking place.
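To make this concrete, here is a minimal Python sketch of the serial execution. It is a plain in-memory stand-in for a database, and the transfer helper is made up purely for illustration:

    # In-memory simulation of the two transfers executed one after the other.
    balances = {"A": 20, "B": 30, "C": 40}

    def transfer(sender, receiver, amount):
        sender_balance = balances[sender]               # RA: read sender balance
        balances[sender] = sender_balance - amount      # WA: write new sender balance
        receiver_balance = balances[receiver]           # RB/RC: read receiver balance
        balances[receiver] = receiver_balance + amount  # WB/WC: write new balance

    transfer("A", "B", 10)  # tx1: begin - RA - WA - RB - WB - end
    transfer("A", "C", 10)  # tx2: begin - RA - WA - RC - WC - end
    print(balances)         # {'A': 0, 'B': 40, 'C': 50}

No money is created or lost: the total stays at 90 euro.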


The interesting world of concurrency

Just like a puzzle becomes more interesting once there is more than 1 piece, concurrency becomes interesting once there are multiple concurrent actors.

A modern database, however, has a few tricks up its sleeve to manage concurrency. Locking is probably the best-known mechanism: it can, for example, avoid data corruption when multiple processes write the same record at the same time. Locking is very powerful, but it is generally not the first choice for dealing with concurrency because it causes contention, which lowers the throughput of your database system. Which brings us to another database trick: Multiversion Concurrency Control, or MVCC for short.

MVCC is an implementation where each transaction gets a snapshot of the database state at transaction start. This allows the DB engine to read data without risking a dirty read: it avoids reading uncommitted data even when that data is being changed by another concurrent process. It also enables repeatable reads. It can, however, cause serialization anomalies, which make scenarios possible that don't make sense in real life. To understand why this is a problem, let's revisit our 2 example transactions. We start with our system initialized again to the starting point detailed earlier, but this time the transactions execute concurrently, such that the different statements within them are interleaved.

Assume the order:

begin - RA - begin - RA - WA - RB - WB - end - WA - RC - WC - end

So in detail:

  • begin: begins the first transaction
  • RA: tx1 reads the balance of account A: 20 euro
  • begin: begins the second transaction
  • RA: tx2 reads the balance of account A: 20 euro (tx1's change is not yet committed, and the snapshot taken when tx2 started still contains the original balance of 20 euro for account A)
  • WA: tx1 writes the new balance of 10 euro for account A
  • RB: tx1 reads the balance of account B: 30 euro
  • WB: tx1 writes the new balance of 40 euro for account B
  • end: ends the first transaction
  • WA: tx2 writes the new balance of 10 euro for account A (since tx2 also read the 20 euro starting balance due to MVCC and subtracted 10 from it)
  • RC: tx2 reads the balance of account C: 40 euro
  • WC: tx2 writes the new balance of 50 euro for account C
  • end: ends the second transaction
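To see the anomaly mechanically, here is a minimal Python sketch of this interleaved execution. MVCC snapshots are simulated with plain dictionary copies; this illustrates the bookkeeping, not how a real engine stores row versions:

    committed = {"A": 20, "B": 30, "C": 40}  # committed database state

    snapshot_t1 = dict(committed)  # begin T1: snapshot at transaction start
    snapshot_t2 = dict(committed)  # begin T2: snapshot at transaction start

    # T1: RA - WA - RB - WB - end (transfer 10 euro from A to B)
    t1_a = snapshot_t1["A"] - 10   # RA, WA: 20 -> 10
    t1_b = snapshot_t1["B"] + 10   # RB, WB: 30 -> 40
    committed.update({"A": t1_a, "B": t1_b})  # end: commit T1

    # T2: WA - RC - WC - end (it already read A = 20 from its snapshot)
    t2_a = snapshot_t2["A"] - 10   # WA: 20 -> 10, overwriting T1's update
    t2_c = snapshot_t2["C"] + 10   # RC, WC: 40 -> 50
    committed.update({"A": t2_a, "C": t2_c})  # end: commit T2

    print(committed)                # {'A': 10, 'B': 40, 'C': 50}
    print(sum(committed.values()))  # 100 euro, up from 90: 10 euro appeared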


This is a serialization anomaly, and this example hopefully illustrates why it is a problem. If it is unclear, imagine you are a bank managing these accounts: your system has just generated money for your customer, who could withdraw or use that money! The above scenario is quite possible if you picture an account with multiple users. Two different users could easily wire money to different accounts around the same time. So for these types of workloads it is important to avoid serialization anomalies.

Serializable isolation to the Rescue

It is possible to avoid serialization anomalies in database engines that allow running with the strictest transaction isolation level: serializable isolation.

Serializable isolation states that for a concurrent execution of transactions to be valid, a serial ordering of those transactions has to exist in which each statement produces the same result as in the concurrent execution.

So let's start from the concurrent execution in our example. Since there are 2 transactions, we have 2 possible serial orderings:

  • Tx1 followed by Tx2
  • Tx2 followed by Tx1

Or in our per statement illustration:

  • begin - RA - WA - RB - WB - end - begin - RA - WA - RC - WC - end
  • begin - RA - WA - RC - WC - end - begin - RA - WA - RB - WB - end

I leave it to the reader to detail the outcomes of the second ordering. You should arrive at the conclusion that account A will always have a balance of 0 after both transactions finish, no matter how you order these 2 transactions serially. Therefore, if you have a database engine running at the serializable isolation level and you attempt the concurrent execution, the engine will raise an error. This does not mean there is an issue in the database engine, but rather that there is a problem with your workload. If it occurs rarely, it's possible to just have your application catch these errors and retry: when this error is thrown, the database engine will generally roll back 1 of the transactions, often the transaction that submitted the statement that would give rise to the anomaly. A sketch of such a retry loop follows below.
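For example, with Postgres and psycopg2 a retry loop could look like this. The balances table and its columns are assumptions made up for this example; the key parts are running at serializable isolation and retrying on SQLSTATE 40001 (serialization_failure):

    import psycopg2
    from psycopg2 import errors

    def transfer_with_retry(conn, sender, receiver, amount, max_retries=5):
        conn.set_session(isolation_level="SERIALIZABLE")
        for _ in range(max_retries):
            try:
                # 'with conn' commits on success and rolls back on error.
                with conn, conn.cursor() as cur:
                    cur.execute("SELECT balance FROM balances WHERE account = %s",
                                (sender,))
                    sender_balance = cur.fetchone()[0]  # RA
                    cur.execute("UPDATE balances SET balance = %s WHERE account = %s",
                                (sender_balance - amount, sender))  # WA
                    cur.execute("SELECT balance FROM balances WHERE account = %s",
                                (receiver,))
                    receiver_balance = cur.fetchone()[0]  # RB
                    cur.execute("UPDATE balances SET balance = %s WHERE account = %s",
                                (receiver_balance + amount, receiver))  # WB
                return  # committed successfully
            except errors.SerializationFailure:
                continue  # our transaction was rolled back; simply try again
        raise RuntimeError("transfer kept hitting serialization failures")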

How can the DB know?

There is no such thing as magic (not even in the database world), so the engine must have some clever way of detecting that a statement would cause a serialization anomaly. Working backwards from the behavior of serially ordered transactions, you could come up with 2 rules:

  1. If data is read in a transaction TX that was written by another transaction TC which was committed before the start of TX, then in a serial ordering TX must follow TC (order: TC -> TX)
  2. If data is read in a transaction TY that is written by another transaction TU that overlaps TY but wasn't committed when TY started, then TY must precede TU (order: TY -> TU)

Note that:
  • Data read and written within the same transaction doesn't impose an ordering
  • 2 reads from different transactions also don't impose an ordering
  • Since more than 2 transactions can overlap, even a read-only transaction can give rise to a serialization anomaly!
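To make the rules concrete, here is a toy Python encoding of them. The Tx record and the idea of tracking read and write sets per transaction are simplifications made up for illustration; real engines track this far more compactly:

    from dataclasses import dataclass, field

    @dataclass
    class Tx:
        started_at: int            # logical start timestamp
        committed_at: int = None   # None while the transaction is still open
        reads: set = field(default_factory=set)   # keys read
        writes: set = field(default_factory=set)  # keys written

    def rule1_edge(tx, tc):
        # Rule 1: TX read data that TC wrote, and TC committed before TX started.
        if tc.committed_at is not None and tc.committed_at < tx.started_at \
                and tx.reads & tc.writes:
            return (tc, tx)  # TC must precede TX

    def rule2_edge(ty, tu):
        # Rule 2: TY read data that TU writes, and TU overlaps TY
        # (it was not committed when TY started).
        if (tu.committed_at is None or tu.committed_at > ty.started_at) \
                and ty.reads & tu.writes:
            return (ty, tu)  # TY must precede TU

    # In our example both directions of rule 2 fire, which is the conflict:
    t1 = Tx(started_at=1, reads={"A", "B"}, writes={"A", "B"})
    t2 = Tx(started_at=2, reads={"A", "C"}, writes={"A", "C"})
    print(rule2_edge(t2, t1) is not None)  # True: T2 -> T1
    print(rule2_edge(t1, t2) is not None)  # True: T1 -> T2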

In concurrent executions that form a serialization anomaly, these rules will produce orderings that are in conflict. For our example:

begin - RA - begin - RA - WA - RB - WB - end - WA - RC - WC - end

  • begin: begins the first transaction (T1)
  • RA: T1 reads the balance of account A
  • begin: begins the second transaction (T2)
  • RA: T2 reads the balance of account A
    • 2 reads don't impose an ordering between each other
  • WA: T1 writes the new balance of account A
    • At this point the DB knows that T2 read data that is written by the overlapping transaction T1, which is uncommitted, so T2 must precede T1 (T2 -> T1)
  • RB: T1 reads the balance of account B
  • WB: T1 writes the new balance of account B
  • end: ends the first transaction
  • WA: T2 writes the new balance of account A
    • Now the DB knows that T1 read data written by T2, which was not committed at the start time of T1, so T1 has to precede T2 (T1 -> T2)

At this stage the DB engine can throw an error, as no matter what happens further with T2 (except for a transaction rollback), it will yield a serialization anomaly. Therefore it is best to roll back the transaction immediately and not allow further statements, as their results shouldn't be relied upon anyway.

The advantage of the arrow notation is that you can chain the orderings together, and as soon as you encounter the same transaction twice you know there is a violation. So if we chain them in the order we found them:

(T2 -> T1) || (T1 -> T2) => (T2 -> T1 -> T2)

A directed graph would be a useful way of tracking the orderings discovered by applying the rules. Cycles would in this case indicate a serialization anomaly, so what the engine effectively needs to maintain is a directed acyclic graph (DAG).
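Here is a minimal Python sketch of such bookkeeping, assuming edges are added one at a time as the rules fire. The class and method names are made up, and real engines use much more compact structures:

    class OrderingGraph:
        """Track 'must precede' edges between transactions; reject cycles."""

        def __init__(self):
            self.edges = {}  # tx -> set of txs that must come after it

        def reachable(self, start, target):
            # Depth-first search: does a path start -> ... -> target exist?
            stack, seen = [start], set()
            while stack:
                tx = stack.pop()
                if tx == target:
                    return True
                if tx not in seen:
                    seen.add(tx)
                    stack.extend(self.edges.get(tx, ()))
            return False

        def add_must_precede(self, before, after):
            # Adding before -> after closes a cycle if 'after' already
            # reaches 'before'; that is a serialization anomaly.
            if self.reachable(after, before):
                raise RuntimeError("could not serialize access")
            self.edges.setdefault(before, set()).add(after)

    g = OrderingGraph()
    g.add_must_precede("T2", "T1")      # rule 2 fired at T1's WA
    try:
        g.add_must_precede("T1", "T2")  # rule 2 fired at T2's WA: cycle
    except RuntimeError as e:
        print(e)                        # could not serialize access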

In summary

Transactions should see the database as if they were running alone on the system, unaffected by other running transactions. This is important because we don't know what will happen to those other transactions: they could be aborted, and in that case relying on the data written by them would be an anomaly in itself. Ironically, it is the measures we put in place to protect against those anomalies that can give rise to a serialization anomaly, where the results of statements from overlapping transactions could not have been produced by any serial execution of those transactions. This post showcases that for MVCC, aims to provide intuition into why serialization anomalies are problematic, and gives basic insight into what a database can do to protect you from them. That is, if you choose to run at the serializable isolation level.

 

Final Notes:

I haven't gone into much detail about locking. Locking can be used to avoid serialization anomalies, but it requires aggressive locks (e.g. exclusive locks) that enforce serial access to the underlying resource. The example was chosen in such a way that the concurrent execution would be possible even with table level locking, where during normal database execution writes would block other writes. This is because locks are held for the lifetime of a transaction, and T1 ended before a write in T2 could give rise to contention on a lock. A sketch of the kind of aggressive locking that would prevent the anomaly follows below.
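As an illustration, in Postgres (at its default read committed isolation level) the anomaly in our example could also be prevented by eagerly locking the sender's row, so the second transfer blocks until the first commits and then re-reads the updated balance. This is a variation of the earlier hypothetical transfer helper, again assuming the made-up balances table:

    def transfer_with_row_lock(conn, sender, receiver, amount):
        with conn, conn.cursor() as cur:
            # FOR UPDATE takes a row-level exclusive lock on the sender's row,
            # held until commit; a concurrent transfer from the same account
            # blocks here instead of reading a stale snapshot.
            cur.execute("SELECT balance FROM balances WHERE account = %s FOR UPDATE",
                        (sender,))
            sender_balance = cur.fetchone()[0]
            cur.execute("UPDATE balances SET balance = %s WHERE account = %s",
                        (sender_balance - amount, sender))
            cur.execute("SELECT balance FROM balances WHERE account = %s",
                        (receiver,))
            receiver_balance = cur.fetchone()[0]
            cur.execute("UPDATE balances SET balance = %s WHERE account = %s",
                        (receiver_balance + amount, receiver))

The price is exactly the contention described above: the two transfers now execute serially on the sender's row.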

I have tried to give a simple explanation of the tracking mechanism above. Note that my example covered only rule 2. Rule 2 is the easiest to track, as you only need to take open transactions into account. Rule 1, on the other hand, involves committed transactions, which are troublesome to track since their number grows with the uptime of your database. You can stop keeping track of a committed transaction from the moment it no longer has (direct or indirect) overlap with open transactions. This overlap can disappear because transactions close (by commit or rollback). But a lot of database instances have a continuous workload, which can block this type of cleanup since there always remains overlap with open transactions. The good news is that there is research into heuristics to efficiently avoid serialization anomalies. A nice and freely available paper on such a heuristic is "Efficiently making (almost) any concurrency control mechanism serializable".