Monday, February 13, 2012

Available strategies to upgrade a database schema without downtime

There is a great book on database refactoring that contains a comprehensive set or recipies on how to revise databases that are supposed to be always online and may have various clients that can't be upgraded at the same time. I guess this is a typical case with large databases and I would be surpised if Amazon stops their servers just to move a column from one table to another. The book describes necessary steps for such changes. Basically it's all about creating intermediate database schemas that would be used during transition period.

For example, if we need to move a column from one table to another:

Version 1.
Table A columns: Name, Price
Table B columns: Quantity, Date

Let's say we move Price to table B:

Version 2.
Table A columns: Name
Table B columns: Quantity, Date, Price

The book suggests an intermediate version:

Version 1_2.
Table A columns: Name, Price
Table B columns: Quantity, Date, Price
Additional trigger that will synchronize "Price" columns between A and B.

Version 1_2 can be used by both clients written for version 1 and 2. Software developers don't need to rush their upgrades, transition can last months and include several changes.

This technique requires accuracy in version control management, but looks very good to implement non-interruptible database schema upgrade. I wonder if this is the only option available for data schema upgrade with no downtime. I can't think about anything else - it this how large data warehouses updata their databases?That's a sensible method. Another is to create a view (name,price) and expose it as TableA. With an instead of trigger, you then can update the correct base tables.

No comments:

Post a Comment