Both InnoDB and PostgreSQL - as well as many other databases - use a technique called multi-version concurrency control (MVCC) to provide transaction isolation: transactions should not see the work of other, uncommitted transactions. MVCC means that, when a row is updated, the database stores both the old and new versions of the row.


Thursday, October 14, 2010

Postgrsql 9.0 beta - Relased with New Features

Postgresql 9.0 beta - relased

PostgreSQL 9.0 has been released with new interesting features, hot standby & streaming replication

 How it works

First we need to clear the concept of physical hot standby. Normally we are using physical warm standby, i.e. a replicated system that start from a backup and is continuing recovering with archive, but it's not accessible for read only transactions. The other possibility is the logical standby that permits query, but it's normally a replicated query system with problems on random functions or DDL .

Hot standby is the union of features of the two: solidity of recover plus the read only access to the databases. This is possible because it's a warm standby that is continuously applying the wal segments ( sent with streaming replication), and moreover encapsulates the transactions in a read-only stack.

The process of generation of an hot standby is so:

  1.  hot backup from master

  2. restore of backup on a slave

  3. startup of a slave and begin the catchup phase with receiver process that asks the wal segments to the sender process for writing wal segments and so permits to the startup process to recover until last wal segment

  4. when it have recovered all wal segments and finally catches up the master, it starts with streaming replication

  5. now the standby is continuously update in async mode.

  6. Now that streaming is function the sender is reading from current wal segment and send the modification to the receiver that write to wal segments, while startup process continue to recover the standby using information of wal segments


Now , what are the limits of the technology? In practice there are 2 limits:

  1. the transaction could not lock in exclusive mode on standby node

  2. the transaction has to abort if the recovery is cleaning the death tuples.

The second point is very interesting. If there is a long query on standby is possible that is reading "old" rows that is begging cleaned up by vacuum in the master node, so when recovery start to clean rows, it not cares about the long query. Why?
The answer is simple, the master doesn't know about what transactions are running in the standby.

For resolve this type of issue there are some practical solutions :

  1. use the new parameter "max_standby_delay" that permits to delay the recovery so the query could finish , the value is in seconds. If it is configured to -1 values, it means infinite time.

  2. vacuum_defer_cleanup_age , the number of transactions by which cleanup of death tuples are deferred

No comments:

Post a Comment