Tuesday, January 22, 2013

Role of MVCC for for in-memory applications to boost performance and concurrency as opposed to OCC. In MVCC, each transaction generates a unique start-timestamp from a monotonically increasing counter and when it ends, it gets a unique end-timestamp. The transaction commits only if there is no overlap with another start/end-transaction pair. This way locking is avoided and a version is generated for each change. The role of MVCC is different from OCC even though both may be an add-on to two-phase locking (2PL) to ensure isolation. In OCC, multiple transactions are allowed to read and update an item without blocking. Instead, transactions maintain histories of their reads and writes which are checked for isolation conflicts before the transaction commits. If a conflict is detected, one of the conflicting transactions is rolled back. The cost of rollbacks and retries increases with conflicts in the OCC scheme. The 2PL refers to the method of locking where a shared lock is obtained on all items for reading and an exclusive lock is acquired on all items written to. The transaction blocks on a wait queue while waiting to acquire a lock. In order to reduce locking, hierarchical locks may be provided. Talking about isolation levels, here are the ones ANSI SQL defines:
1. READ UNCOMMITTED: Any version of the data can be read without taking locks. This is achieved in a locking implementation by read requests proceeding without acquiring any locks.
2. READ COMMITTED: A transaction may read any committed version of the data. Repeated reads of an object may result in different versions being read.
3. REPEATABLE READ: A transaction will read only one version of committed data. Once the transaction reads an object, it will always read the same version of that object until the end of the transaction.
4. SERIALIZABLE: All reads and writes are serialized.

The phantom problem is described as follows:  Repeated access of the transaction see tuples that were not seen on the first access. This happens because the locks are for the tuples and does not prevent insert of new tuples. Two phase locking of tables prevents phantoms, but table level locking can be restrictive in cases where transaction access only a few tuples with an index.

Other than the ANSI SQL levels, these additional levels are also popular.
CURSOR STABILITY: This is used to fix the "lost update" nuance of read committed. The lost update is when the changes made by one transaction  are lost because another makes an update. This is solved by holding a lock on the most recently read item on a cursor and dropped when the cursor is moved or the transaction terminates.
SNAPSHOT ISOLATION uses the MVCC to version every change by a transaction and prevent the overlap by any two start/end-transaction pairs.
READ CONSISTENCY is another MVCC where the locks are acquired at the statement level so the statements see the most recently committed value at the start and uses that logical version. A single transaction could use many different logical version. Only the most recent version is stored and older versions are retrieved using undo log records. Modifications are maintained via long term write locks and first writer wins.
 

No comments:

Post a Comment