Concept
Databases

Schema migrations with zero downtime

Expand and contract, dual writes, backfills.

You want to rename a column on a 200 GB table. The naive approach is to take the system down, run ALTER TABLE, and bring it back up. On a small system that is a 5-minute outage. On a big system it is hours of downtime no one will let you schedule. The way to ship schema changes safely is to never lock the table at all: deploy the change as a series of small, reversible steps where the application stays up the entire time.

The pattern: expand and contract

Every safe schema change follows the same shape. Add the new thing, write to both for a while, switch the readers over, then remove the old thing.

sequenceDiagram
    autonumber
    participant App as Application
    participant DB as Database

    Note over App,DB: Step 1 — Expand. Add the new column.
    App->>DB: ALTER TABLE ADD COLUMN email_v2 TEXT
    Note over DB: New column nullable, no rewrite.

    Note over App,DB: Step 2 — Dual write. Write both old and new.
    App->>DB: INSERT/UPDATE old + new together
    Note over App: Code reads from old, writes both.

    Note over App,DB: Step 3 — Backfill. Copy old to new for existing rows.
    App->>DB: UPDATE ... SET email_v2 = email WHERE email_v2 IS NULL
    Note over DB: Run in batches, off-peak, idempotent.

    Note over App,DB: Step 4 — Switch reads.
    App->>DB: deploy code that reads email_v2 instead of email
    Note over App: Verify metrics, error rates, dashboards.

    Note over App,DB: Step 5 — Stop dual write. Read and write only new.
    App->>DB: stop writing email, just write email_v2

    Note over App,DB: Step 6 — Contract. Drop the old column.
    App->>DB: ALTER TABLE DROP COLUMN email

Each step is one deploy. Between steps, both versions of the code (old and new) can run side by side. That is the whole point: you can roll back any single step without a hot fix.

Why the naive approach hurts

flowchart TB
    subgraph BAD["The dangerous one-step migration"]
        direction LR
        A1["deploy new code"]:::client
        A2["ALTER TABLE<br/>RENAME email<br/>TO email_v2"]:::infra
        A3[("Table locked<br/>for minutes or hours")]:::dead
        A1 --> A2 --> A3
    end

    subgraph GOOD["Expand and contract"]
        direction LR
        B1["ADD COLUMN"]:::ok
        B2["dual write"]:::ok
        B3["backfill"]:::ok
        B4["switch reads"]:::ok
        B5["stop dual write"]:::ok
        B6["DROP COLUMN"]:::ok
        B1 --> B2 --> B3 --> B4 --> B5 --> B6
    end

    classDef client fill:#dbeafe,stroke:#1e40af,color:#1e3a8a,stroke-width:1.5px
    classDef infra fill:#fef3c7,stroke:#a16207,color:#713f12,stroke-width:1.5px
    classDef dead fill:#fecaca,stroke:#b91c1c,color:#7f1d1d,stroke-width:1.5px
    classDef ok fill:#dcfce7,stroke:#15803d,color:#14532d,stroke-width:1.5px

RENAME is the most-cited example but the same applies to: changing a column type, splitting a table in two, merging two columns, dropping NOT NULL, adding NOT NULL, adding a foreign key. Anything that locks a big table for the duration of the change is something users will notice.

The change-data-capture variant

For very large tables or zero-tolerance windows, replace the application-level dual write with a change-data-capture (CDC) stream. The application keeps writing to the old column; a CDC pipeline mirrors every change into the new column in near real time. Cleaner separation, but more infrastructure.

flowchart LR
    APP(["Application"]):::client
    OLD[("Old column / table")]:::store
    CDC[["CDC stream<br/>(Debezium, Maxwell)"]]:::infra
    NEW[("New column / table")]:::store

    APP ==>|"writes"| OLD
    OLD ==>|"WAL events"| CDC
    CDC ==>|"apply"| NEW

    classDef client fill:#dbeafe,stroke:#1e40af,color:#1e3a8a,stroke-width:1.5px
    classDef infra fill:#fef3c7,stroke:#a16207,color:#713f12,stroke-width:1.5px
    classDef store fill:#e9d5ff,stroke:#7e22ce,color:#581c87,stroke-width:1.5px

This is how big migrations of partitioned tables, sharded clusters, or whole-database moves are usually done.

When you really cannot avoid downtime

  • Changing a primary key on a huge table.
  • Cross-database moves where transactions matter.
  • Adding NOT NULL on a column with billions of nulls (you can backfill first, but the final lock is short).

Even here, you minimise: announce the window, plan rollbacks, and rehearse the cutover in a staging environment.

Two scenarios

Scenario one: renaming a column on a hot table.

Users table. 100 million rows. You want to rename email to email_address. Six deploys (add column, dual write, backfill, switch reads, stop old write, drop column). Each deploy is small, reversible, and ships behind a feature flag. Total elapsed time across one or two weeks. Zero downtime.

Scenario two: changing a column type.

amount is INT (cents); you need DECIMAL (for fractional cents and bigger numbers). Same pattern: add amount_v2 DECIMAL. Dual write with conversion. Backfill in batches with idempotent UPDATE. Switch readers. Drop the old column.

What this connects to

  • Read replicas. During a migration you want the replica to be caught up before you cut over. Watch replication lag. See Read replicas.
  • Idempotency. Backfills get retried, replayed, and resumed. They must be safe to run twice. See Idempotency.
  • Indexes. Creating a new index also blocks writes unless you use CREATE INDEX CONCURRENTLY (Postgres) or equivalent. See Indexes that help, indexes that hurt.
  • Graceful degradation. Feature-flag the new reader path so a problem during migration can be turned off without rolling back code. See Graceful degradation.

Common mistakes

  • Running ALTER TABLE directly on a busy table. In Postgres, even adding a column with a default can rewrite the whole table on older versions. Check the docs for your engine and version.
  • Skipping the dual-write phase. “It is just a rename, we will do it in one deploy” lasts until the deploy is half rolled out and old pods are writing to a column that no longer exists.
  • Backfilling in one huge transaction. Long transactions hold locks and block other work. Backfills must be batched and idempotent: 1,000 rows per batch, a small sleep between batches.
  • Forgetting the read path. Switching writes is half the work. Readers need to be updated too, and there is often a delay between the two changes.
  • No rollback plan. Each step should be independently reversible. If step 4 introduces a regression, you must be able to roll back to step 3 without a new migration.
  • Migrating data without a verification step. After backfill, run a count and a checksum comparison. Without that, drift goes undetected for months.

Quick recap

  • Never lock a big table in production. Decompose every change into “expand, dual write, backfill, switch, contract.”
  • Each step is its own deploy, behind a feature flag, independently reversible.
  • For very large or sensitive tables, use a CDC stream instead of application-level dual writes.
  • Backfills must be batched, idempotent, and verified.
  • The senior question on any migration: “what is the rollback plan for each step?”

This concept sits in Stage 2 (Storage and data) of the System Design Roadmap.