Commit 99cca24

Feat!: Add support for concurrent table diff of multiple models (#4256)
1 parent f800529 commit 99cca24

File tree

14 files changed

+852
-210
lines changed

docs/guides/model_selection.md

Lines changed: 76 additions & 1 deletion
@@ -2,7 +2,7 @@
22

33
This guide describes how to select specific models to include in a SQLMesh plan, which can be useful when modifying a subset of the models in a SQLMesh project.
44

5-
Note: the selector syntax described below is also used for the SQLMesh `plan` [`--allow-destructive-model` selector](../concepts/plans.md#destructive-changes).
5+
Note: the selector syntax described below is also used for the SQLMesh `plan` [`--allow-destructive-model` selector](../concepts/plans.md#destructive-changes) and for the `table_diff` command to [diff a selection of models](./tablediff.md#diffing-multiple-models-across-environments).
66

77
## Background
88

@@ -221,6 +221,81 @@ Models:
221221
└── sushi.customer_revenue_lifetime
222222
```
223223

224+
#### Select with tags
225+
226+
If we specify the `--select-model` option with a tag selector like `"tag:reporting"`, all models with the "reporting" tag will be selected. Tags are case-insensitive and support wildcards:
227+
228+
```bash
229+
❯ sqlmesh plan dev --select-model "tag:reporting*"
230+
New environment `dev` will be created from `prod`
231+
232+
Differences from the `prod` environment:
233+
234+
Models:
235+
├── Directly Modified:
236+
│ ├── sushi.daily_revenue
237+
│ └── sushi.monthly_revenue
238+
└── Indirectly Modified:
239+
└── sushi.revenue_dashboard
240+
```
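The case-insensitive, wildcard-aware tag matching described above can be sketched in Python. This is only an illustration, not SQLMesh's implementation: `select_by_tag`, the `MODEL_TAGS` mapping and the tag assigned to `sushi.items` are hypothetical, and `fnmatch` stands in for whatever matching SQLMesh actually uses.

```python
from fnmatch import fnmatchcase


def select_by_tag(model_tags: dict[str, set[str]], pattern: str) -> set[str]:
    """Return models that have at least one tag matching the pattern, ignoring case."""
    pattern = pattern.lower()
    return {
        model
        for model, tags in model_tags.items()
        if any(fnmatchcase(tag.lower(), pattern) for tag in tags)
    }


# Hypothetical tag assignments, for illustration only.
MODEL_TAGS = {
    "sushi.daily_revenue": {"Reporting"},
    "sushi.monthly_revenue": {"reporting_monthly"},
    "sushi.items": {"raw"},
}

# "reporting*" matches both "Reporting" and "reporting_monthly".
selected = select_by_tag(MODEL_TAGS, "reporting*")
```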
241+
242+
#### Select with git changes
243+
244+
The git-based selector allows you to select models whose files have changed compared to a target branch (default: main). This includes:
245+
- Untracked files (new files not in git)
246+
- Uncommitted changes in working directory
247+
- Committed changes different from the target branch
248+
249+
For example:
250+
251+
```bash
252+
❯ sqlmesh plan dev --select-model "git:feature"
253+
New environment `dev` will be created from `prod`
254+
255+
Differences from the `prod` environment:
256+
257+
Models:
258+
├── Directly Modified:
259+
│ └── sushi.items # Changed in feature branch
260+
└── Indirectly Modified:
261+
├── sushi.order_items
262+
└── sushi.daily_revenue
263+
```
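One way to gather the three kinds of changed files listed above is to combine the output of a few standard git commands. The sketch below is an assumption about how this could be done, not a description of SQLMesh's internals: `changed_files` and the injectable `run` callable are hypothetical names, and mapping the resulting file paths back to models is left out.

```python
import subprocess
from typing import Callable, Sequence


def _run_git(args: Sequence[str]) -> str:
    """Run a git subcommand in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def changed_files(
    target_branch: str = "main",
    run: Callable[[Sequence[str]], str] = _run_git,
) -> set[str]:
    """Collect untracked, uncommitted and committed changes relative to a branch."""
    commands = [
        # Untracked files that are not ignored.
        ["ls-files", "--others", "--exclude-standard"],
        # Staged and unstaged changes relative to the last commit.
        ["diff", "--name-only", "HEAD"],
        # Commits on the current branch since it diverged from the target.
        ["diff", "--name-only", f"{target_branch}...HEAD"],
    ]
    files: set[str] = set()
    for args in commands:
        files.update(line for line in run(args).splitlines() if line)
    return files
```

Passing `run` explicitly makes the function easy to exercise without a real repository, for example with a stub that returns canned output.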
264+
265+
You can also combine git selection with upstream/downstream indicators:
266+
267+
```bash
268+
❯ sqlmesh plan dev --select-model "git:feature+"
269+
# Selects changed models and their downstream dependencies
270+
271+
❯ sqlmesh plan dev --select-model "+git:feature"
272+
# Selects changed models and their upstream dependencies
273+
```
274+
275+
#### Complex selections with logical operators
276+
277+
The model selector supports combining multiple conditions using logical operators:
278+
279+
- `&` (AND): Both conditions must be true
280+
- `|` (OR): Either condition must be true
281+
- `^` (NOT): Negates a condition
282+
283+
For example:
284+
285+
```bash
286+
❯ sqlmesh plan dev --select-model "(tag:finance & ^tag:deprecated)"
287+
# Selects models with finance tag that don't have deprecated tag
288+
289+
❯ sqlmesh plan dev --select-model "(+model_a | model_b+)"
290+
# Selects model_a and its upstream deps OR model_b and its downstream deps
291+
292+
❯ sqlmesh plan dev --select-model "(tag:finance & git:main)"
293+
# Selects changed models that also have the finance tag
294+
295+
❯ sqlmesh plan dev --select-model "^(tag:test) & metrics.*"
296+
# Selects models in metrics schema that don't have the test tag
297+
```
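The three operators behave like set operations over the project's models. The following sketch makes that analogy concrete under simplifying assumptions: the model names, the `TAGS` mapping and the `negate` helper are all hypothetical, and real selector parsing is not shown.

```python
# Hypothetical project contents, for illustration only.
ALL_MODELS = {
    "metrics.daily",
    "metrics.test_daily",
    "finance.ledger",
    "finance.old_ledger",
}
TAGS = {
    "finance": {"finance.ledger", "finance.old_ledger"},
    "deprecated": {"finance.old_ledger"},
    "test": {"metrics.test_daily"},
}
METRICS_SCHEMA = {m for m in ALL_MODELS if m.startswith("metrics.")}


def negate(selection: set[str]) -> set[str]:
    """`^` corresponds to the complement with respect to all models."""
    return ALL_MODELS - selection


# "(tag:finance & ^tag:deprecated)": `&` corresponds to set intersection.
finance_not_deprecated = TAGS["finance"] & negate(TAGS["deprecated"])

# "^(tag:test) & metrics.*"
metrics_not_test = negate(TAGS["test"]) & METRICS_SCHEMA

# `|` corresponds to set union.
finance_or_metrics = TAGS["finance"] | METRICS_SCHEMA
```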
298+
224299
### Backfill examples
225300

226301
#### No backfill selection

docs/guides/tablediff.md

Lines changed: 49 additions & 0 deletions
@@ -122,6 +122,55 @@ Under the hood, SQLMesh stores temporary data in the database to perform the com
122122
The default schema for these temporary tables is `sqlmesh_temp` but can be changed with the `--temp-schema` option.
123123
The schema can be specified as a `CATALOG.SCHEMA` or `SCHEMA`.
124124

125+
126+
## Diffing multiple models across environments
127+
128+
SQLMesh allows you to compare multiple models across environments at once using model selection expressions. This is useful when you want to validate changes across a set of related models or the entire project.
129+
130+
To diff multiple models, use the `--select-model` (or `-m` for short) option with the table diff command:
131+
132+
```bash
133+
sqlmesh table_diff prod:dev --select-model "sqlmesh_example.*"
134+
```
135+
136+
When diffing multiple models, SQLMesh will:
137+
138+
1. Show the models returned by the selector that exist in both environments and have differences
139+
2. Compare these models and display the data diff of each model
140+
141+
> Note: Models will only be data diffed if there's a breaking change that impacts them.
142+
143+
The `--select-model` option supports a powerful selection syntax that lets you choose models using patterns, tags, dependencies and git status. For complete details, see the [model selection guide](./model_selection.md).
144+
145+
> Note: Surround your selection pattern with single or double quotes, for example `'*'` or `"sqlmesh_example.*"`.
146+
147+
Here are some common examples:
148+
149+
```bash
150+
# Select all models in a schema
151+
sqlmesh table_diff prod:dev -m "sqlmesh_example.*"
152+
153+
# Select a model and its dependencies
154+
sqlmesh table_diff prod:dev -m "+model_name" # include upstream deps
155+
sqlmesh table_diff prod:dev -m "model_name+" # include downstream deps
156+
157+
# Select models by tag
158+
sqlmesh table_diff prod:dev -m "tag:finance"
159+
160+
# Select models with git changes
161+
sqlmesh table_diff prod:dev -m "git:feature"
162+
163+
# Use logical operators for complex selections
164+
sqlmesh table_diff prod:dev -m "(metrics.* & ^tag:deprecated)" # models in the metrics schema that aren't deprecated
165+
166+
# Combine multiple selectors
167+
sqlmesh table_diff prod:dev -m "tag:finance" -m "metrics.*_daily"
168+
```
169+
170+
When multiple selectors are provided, they are combined with OR logic, meaning a model matching any of the selectors will be included.
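The OR behaviour amounts to taking the union of what each selector matches. Here is a minimal Python sketch of that idea, assuming only name globs and `tag:` selectors; `matches`, `select_any` and the `MODELS` mapping are hypothetical and do not reflect SQLMesh's actual selector code.

```python
from fnmatch import fnmatchcase

# Hypothetical models mapped to their tags, for illustration only.
MODELS = {
    "metrics.revenue_daily": {"finance"},
    "metrics.users_weekly": set(),
    "sushi.orders": {"finance"},
}


def matches(model: str, tags: set[str], selector: str) -> bool:
    """Check one selector: either `tag:<pattern>` or a model name glob."""
    if selector.startswith("tag:"):
        pattern = selector[len("tag:"):].lower()
        return any(fnmatchcase(tag.lower(), pattern) for tag in tags)
    return fnmatchcase(model, selector)


def select_any(models: dict[str, set[str]], selectors: list[str]) -> set[str]:
    """A model is included if ANY selector matches it (OR logic)."""
    return {
        model
        for model, tags in models.items()
        if any(matches(model, tags, selector) for selector in selectors)
    }


selected = select_any(MODELS, ["tag:finance", "metrics.*_daily"])
```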
171+
172+
> Note: Every model being compared must define a `grain` that is unique and not null, because the grain is used to join the tables in the two environments.
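To see why the grain must be unique and not null, consider a simplified in-memory version of a grain-based comparison. This is a sketch under stated assumptions, not how SQLMesh performs the diff (which runs in the database): `index_by_grain`, `diff_rows` and the sample rows are hypothetical.

```python
from typing import Any

Row = dict[str, Any]


def index_by_grain(rows: list[Row], grain: list[str]) -> dict[tuple, Row]:
    """Index rows by their grain, rejecting null or duplicate keys."""
    indexed: dict[tuple, Row] = {}
    for row in rows:
        key = tuple(row[column] for column in grain)
        if any(value is None for value in key):
            raise ValueError(f"grain contains a null value: {key}")
        if key in indexed:
            raise ValueError(f"grain is not unique: {key}")
        indexed[key] = row
    return indexed


def diff_rows(source: list[Row], target: list[Row], grain: list[str]) -> dict[str, list[tuple]]:
    """Join the two row sets on the grain and classify each key."""
    src = index_by_grain(source, grain)
    tgt = index_by_grain(target, grain)
    return {
        "source_only": sorted(src.keys() - tgt.keys()),
        "target_only": sorted(tgt.keys() - src.keys()),
        "changed": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }


# Hypothetical rows from the two environments.
prod_rows = [{"id": 1, "total": 10}, {"id": 2, "total": 20}]
dev_rows = [{"id": 2, "total": 25}, {"id": 3, "total": 30}]
result = diff_rows(prod_rows, dev_rows, grain=["id"])
```

With a duplicate or null key, a row could pair with several rows or with none, so the join would no longer give a one-to-one comparison.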
173+
125174
## Diffing tables or views
126175

127176
Compare specific tables or views with the SQLMesh CLI interface by using the command `sqlmesh table_diff [source table]:[target table]`.

docs/reference/cli.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -529,7 +529,7 @@ Options:
529529
```
530530
Usage: sqlmesh table_diff [OPTIONS] SOURCE:TARGET [MODEL]
531531
532-
Show the diff between two tables.
532+
Show the diff between two tables or multiple models across two environments.
533533
534534
Options:
535535
-o, --on TEXT The column to join on. Can be specified multiple
@@ -548,6 +548,7 @@ Options:
548548
--temp-schema TEXT Schema used for temporary tables. It can be
549549
`CATALOG.SCHEMA` or `SCHEMA`. Default:
550550
`sqlmesh_temp`
551+
-m, --select-model TEXT Select specific models to table diff.
551552
--help Show this message and exit.
552553
```
553554

docs/reference/notebook.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -293,7 +293,7 @@ Create a schema file containing external model schemas.
293293
%table_diff [--on [ON ...]] [--skip-columns [SKIP_COLUMNS ...]]
294294
[--model MODEL] [--where WHERE] [--limit LIMIT]
295295
[--show-sample] [--decimals DECIMALS] [--skip-grain-check]
296-
[--temp-schema SCHEMA]
296+
[--temp-schema SCHEMA] [--select-model [SELECT_MODEL ...]]
297297
SOURCE:TARGET
298298
299299
Show the diff between two tables.
@@ -320,6 +320,8 @@ options:
320320
--skip-grain-check Disable the check for a primary key (grain) that is
321321
missing or is not unique.
322322
--temp-schema SCHEMA The schema to use for temporary tables.
323+
--select-model <[SELECT_MODEL ...]>
324+
Select specific models to diff using a pattern.
323325
```
324326

325327
#### model

sqlmesh/cli/main.py

Lines changed: 10 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -892,18 +892,26 @@ def create_external_models(obj: Context, **kwargs: t.Any) -> None:
892892
type=str,
893893
help="Schema used for temporary tables. It can be `CATALOG.SCHEMA` or `SCHEMA`. Default: `sqlmesh_temp`",
894894
)
895+
@click.option(
896+
"--select-model",
897+
"-m",
898+
type=str,
899+
multiple=True,
900+
help="Specify one or more models to data diff. Use wildcards to diff multiple models. Ex: '*' (all models with applied plan diffs), 'demo.model+' (this and downstream models), 'git:feature_branch' (models with direct modifications in this branch only)",
901+
)
895902
@click.pass_obj
896903
@error_handler
897904
@cli_analytics
898905
def table_diff(
899906
obj: Context, source_to_target: str, model: t.Optional[str], **kwargs: t.Any
900907
) -> None:
901-
"""Show the diff between two tables."""
908+
"""Show the diff between two tables or a selection of models when they are specified."""
902909
source, target = source_to_target.split(":")
910+
select_models = {model} if model else kwargs.pop("select_model", None)
903911
obj.table_diff(
904912
source=source,
905913
target=target,
906-
model_or_snapshot=model,
914+
select_models=select_models,
907915
**kwargs,
908916
)
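The selection logic in this hunk can be exercised in isolation with a small stand-in. The function below is a hypothetical helper, not part of SQLMesh: `resolve_selection` mirrors only the `split` and `select_models` lines shown above and returns the leftover keyword arguments instead of calling `Context.table_diff`.

```python
import typing as t


def resolve_selection(
    source_to_target: str, model: t.Optional[str], **kwargs: t.Any
) -> t.Tuple[str, str, t.Any, t.Dict[str, t.Any]]:
    """Mirror the hunk: a positional MODEL takes precedence over --select-model."""
    source, target = source_to_target.split(":")
    select_models = {model} if model else kwargs.pop("select_model", None)
    return source, target, select_models, kwargs


# With multiple=True, click passes the option values as a tuple.
with_option = resolve_selection("prod:dev", None, select_model=("tag:finance",))

# When MODEL is given, the `or`-style short circuit means pop() never runs.
with_model = resolve_selection("prod:dev", "sushi.items", select_model=())
```

In this sketch, `select_model` stays in the leftover keyword arguments whenever the positional model is supplied. Whether that matters depends on the signature of `Context.table_diff`, which is not shown in this hunk.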
909917
