Practice Problem #77 | Medium | Batch Pipelines & Orchestration

dbt Incremental vs Full Refresh

Tags: dbt, incremental, MERGE, materialisation

Scenario: A dbt project has 40 models. Most are materialised as table, which means every dbt run rebuilds each of them from scratch. The largest fact table now has 800 million rows, the daily run takes 90 minutes, and the warehouse bill is climbing. A teammate suggests changing every model to incremental. You say “not every model, and not the way you think.” You are asked to teach the difference.

In the interview, the question is:

When should a dbt model be incremental, when should it be a full refresh, and what does the incremental materialisation actually do under the hood?


Your Task:

  1. Explain three core dbt materialisations (view, table, incremental) in one sentence each.
  2. Walk through what is_incremental() does and what a typical incremental model looks like.
  3. Cover three common incremental strategies (append, merge, delete+insert) and when each fits.
  4. Explain when an incremental model is the wrong choice.
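
As a warm-up for task 3, the row-level effect of the three strategies can be sketched with a toy in-memory model. This is not dbt code and not what dbt generates. The function names, the order_id key, and the sample rows are all hypothetical, chosen only to illustrate how the same batch lands differently under each strategy.

```python
# Toy in-memory model of three incremental strategies.
# Assumption: tables are lists of dicts; this is not dbt code.


def append(target, batch):
    """Insert every batch row as-is; existing rows are never touched."""
    return target + batch


def merge(target, batch, key):
    """Upsert: overwrite rows whose key already exists, insert the rest."""
    by_key = {row[key]: row for row in target}
    for row in batch:
        by_key[row[key]] = row  # overwrite on match, insert otherwise
    return list(by_key.values())


def delete_insert(target, batch, key):
    """Delete target rows whose key appears in the batch, then insert the batch."""
    batch_keys = {row[key] for row in batch}
    kept = [row for row in target if row[key] not in batch_keys]
    return kept + batch


# Hypothetical sample data.
target = [
    {"order_id": 1, "status": "placed"},
    {"order_id": 2, "status": "placed"},
]
batch = [
    {"order_id": 2, "status": "shipped"},  # new version of an existing row
    {"order_id": 3, "status": "placed"},   # brand-new row
]

appended = append(target, batch)
merged = merge(target, batch, key="order_id")
replaced = delete_insert(target, batch, key="order_id")

print(len(appended))  # 4: order_id 2 now appears twice
print(len(merged))    # 3: order_id 2 was updated in place
```

In this toy, append leaves two versions of order 2 in the table, while merge and delete+insert both end with one row per key. How a real warehouse behaves when the batch itself contains duplicate keys depends on the platform and is worth covering in your answer.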

What a Good Answer Covers:

  • The MERGE statement that dbt generates under the hood.
  • The unique_key config and what happens without one.
  • Late-arriving data and the lookback window.
  • When a full refresh is still the right answer (small dim tables, model logic changed).
  • The --full-refresh flag and when to run it.
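
The lookback-window idea in the list above can be sketched the same way, as a toy Python filter rather than a real dbt model. It assumes a hypothetical event_date column and a 3-day window. target=None stands in for "the table does not exist yet", and full_refresh stands in for the --full-refresh flag, the two cases in which the whole source is processed.

```python
from datetime import date, timedelta


def rows_to_process(source, target, lookback_days=3, full_refresh=False):
    """Return the source rows a single run would process (toy model)."""
    # First build or forced rebuild: take everything.
    if target is None or full_refresh:
        return list(source)
    high_water = max((r["event_date"] for r in target), default=None)
    if high_water is None:  # the table exists but is empty
        return list(source)
    # Reprocess a trailing window so late-arriving rows are picked up.
    cutoff = high_water - timedelta(days=lookback_days)
    return [r for r in source if r["event_date"] >= cutoff]


# Hypothetical data: event 3 is dated Jan 9 but arrived after the last run.
target = [
    {"id": 1, "event_date": date(2024, 1, 8)},
    {"id": 2, "event_date": date(2024, 1, 10)},
]
source = [
    {"id": 0, "event_date": date(2024, 1, 1)},
    {"id": 1, "event_date": date(2024, 1, 8)},
    {"id": 2, "event_date": date(2024, 1, 10)},
    {"id": 3, "event_date": date(2024, 1, 9)},   # late-arriving
    {"id": 4, "event_date": date(2024, 1, 11)},
]

no_lookback = [r["id"] for r in rows_to_process(source, target, lookback_days=0)]
with_lookback = [r["id"] for r in rows_to_process(source, target, lookback_days=3)]

print(no_lookback)    # [2, 4]
print(with_lookback)  # [1, 2, 3, 4]
```

With no lookback the late row (id 3) is silently skipped. With the window it is picked up, but rows 1 and 2 are reprocessed as well, which is why a lookback window is normally paired with a unique key so the reprocessed rows replace their earlier versions instead of duplicating them.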

Try the problem on your own first. Solutions are most valuable after you've struggled with it.