Concurrency control in web development

Feb 26 2017

One of the foundations, when talking about applications related to ecommerce, finance etc. is Consistency. Consistency is one of the transaction database properties called ACID (Atomicity, Consistency, Isolation, Durability). It means that a transaction either creates a new and valid state of data, or, if any failure occurs, returns all data to its state before the transaction was started.
Sometimes we tend to think that the database takes care of concurrency issues automatically but unfortunately we are wrong… It’s true that most of the relational databases do support ACID foundations but if we don’t handle it properly in the application server side, we may encounter concurrency problems when more than one user is working on the system.

In this post, I’m going to talk about locking and how you can use it to protect data consistency.

Defining the problem

First of all, we have to understand what the problem is and why we have to lock objects. Let’s start with an example. Assume that there is a bank account with a balance of a $1000. The following transaction describes a single user withdrawal:

ATM Transaction
balance = read_bal (accountID);
balance = balance – 100;
write_bal (accountID, balance);

In the example above, everything works properly and we don’t have to worry about concurrency because only one user is working with the database.

Now, let’s examine a scenario in which two people try to withdraw $100 from the same account at the same time.

  • Both ATMs check the balance of the account and see it’s $1000
  • The first user withdraw $100 from $1000
  • The second user withdraw $100 from $1000
  • The first user writes $900 to the account
  • The Second user writes $900 to the account

Finally, $200 were withdrawn from the account but the balance stands on $900 which means that only $100 was deducted from the account instead of $200.

The following figure describes that scenario.

ATM Transaction 1 (User1)ATM Transaction 2 (User2)DB state
balance = read_bal (accountID);$1000
balance = read_bal (accountID);$1000
balance = balance – 100;$1000
balance = balance – 100;$1000
write_bal (accountID, balance);$900
write_bal (accountID, balance);$900

How can we solve this problem? Just lock!

There are many articles talking about locking theory but here I would like to detail them in a nutshell.
Basically we have the following locking strategies:

  • Resource Locking – “Overly Optimistic” − Assumes conflicts will never occur − for single-user systems, read-only tables, or where records are guaranteed to only be accessed by one person at a time.
  • Optimistic − Assumes conflict will generally not occur − lock obtained after transaction processed − It’s better when lock granularity is large.
  • Pessimistic − Assumes conflict will generally occur − lock obtained before transaction processing and released afterward − Better when lock granularity is small.

Getting started with optimistic locking

Generally speaking, locking is a technique that allows us to prevent users from overwriting changes that made by other users. We would like to prevent cases in which a user enters an application to make a change to a piece of data, and just before the user submits, another user comes and submits a change.

To start with optimistic locking you simply have to add the column lock_version to the tables you would like to enable locking:  


class CreateAccounts < ActiveRecord
 def change
   create_table :accounts do |t|
     t.string :user_name
     t.integer :balance
     t.integer :lock_version
     t.timestamps
   end
 end
end

When the lock_version exists, every update, Rails increments the lock_version value. In this case, if two users try to update the object at the same time an ActiveRecord::StaleObjectError error will be raised.

Now, let’s see the common ATM scenario. Two users are trying to withdraw money from the same account


my_account =  Account.find(1)
my_account.balance = my_account.balance - 100
My_account.save

In this case, the generated SQL increments the lock_vesion value and makes sure that the value hasn’t been changed since the data was retrieved.

UPDATE "accounts" SET "balance" = 100, "updated_at" = '2016-11-30 20:07:54.500467', "lock_version" = 5 WHERE "accounts"."id" = $1 AND "accounts"."lock_version" = $2  [["id", 1], ["lock_version", 4]]

Now, let’s examine a scenario in which two transactions are executed concurrently on the same account:


// User 1 on account A
my_account1 =  Account.find(1) // read lock_version=1
my_account1.balance = my_account1.balance - 50
My_account1.save // write lock_version=2 and check that the database contains lock_version=1

// User 2 on account A
my_account2 =  Account.find(1) // read lock version=1
my_account2.balance = my_account2.balance - 100
My_account2.save // write lock_version=2 and check that the database contains lock_version=1 ⇒ fails due to the fact that the first transaction has just updated the lock_verison to 2

In this case, the second user will get the following exception:

ActiveRecord::StaleObjectError: Attempted to update a stale object: Account.

Pessimistic locking

When using pessimistic locking, only the first user can access the object and will be able to update it where. All other users will not be allowed to read it until the first user commits the update.
The pessimistic locking is a built in technique in rails. Calling the lock! function will generate an SQL that contains the FOR UPDATE clause in which locks the row in the database until a transaction commit is issued or a timeout occurs.


my_account =  Account.find(1)
My_account.lock! 

SELECT  "accounts".* FROM "accounts" WHERE "accounts"."id" = $1 LIMIT $2 FOR UPDATE  [["id", 1], ["LIMIT", 1]]

Now, let’s examine the ATM problem with two concurrent transactions.

// User 1 on account A
my_account1 =  Account.find(1)
my_account1.lock!
my_account1.balance = my_account1.balance - 50
My_account1.save

// User 2 on account A
my_account2 =  Account.find(1) my_account2.lock! my_account2.balance = my_account2.balance - 100 My_account2.save

In the example above, the user who is the first one to execute the lock can access the object (Read/Write) while user 2 is not allowed to read it until the first user commits the update.

Summary

Knowing which locking scheme should be used totally depends on the requirements.
So to summarize

Optimistic locking

When using optimistic locking, users do not lock data when they read it. When a user updates data, the application checks if another user changed the data after it was read. If another user updated the data, an error is raised. In this case, the user receiving the error rolls back the transaction and starts again. This is called optimistic because it is mainly used in environments where there is low contention for data, and where the cost of occasionally rolling back a transaction is lower than the cost of locking data when read.
Pros:

  1. No use of semaphores and mutex, the application gains better performance and scalability.
  2. The data manipulation of the first transaction that performs the update is persisted, while the other transactions are informed of the change and requested to repeat the update on the modified data.

Cons:

  1. Users have to perform the updates again, if another transaction updates the data after it was read by the user process. In multi user systems this may cause frustration.

Pessimistic locking

A system of locks prevents users from modifying data in a way that affects other users. After a user performs an action that causes a lock to be applied, other users cannot perform actions that would conflict with the lock until the owner releases it. This is called pessimistic lock because it is mainly used in environments where there is high contention for data, where the cost of protecting data with locks is less than the cost of rolling back transactions if a concurrency conflicts occur.
Pros:
  1. Once a user achieves a lock, a rollback due to race condition will not occur.
  2. Suitable for situations where a process is long and may consist of multiple transactions.

Cons:

  1. If the lock is not released properly, it can lead to degradation in terms of performance over time.
  2. A deadlock can occur when a circular dependency is introduced between two transactions.

References

Dan Y.
Software Developer
Back to Blog