Merged
26 commits
95cb984
Feat: Add support for concurrent table diff across all impacted models
themisvaltinos Apr 25, 2025
f0df829
display which tables have changes
themisvaltinos Apr 25, 2025
bdd885a
fix noop
themisvaltinos Apr 25, 2025
e45ed14
split model diff and table diff functionality
themisvaltinos Apr 28, 2025
13b3fb6
refactor; clean up code
themisvaltinos Apr 29, 2025
2fde2d4
style
themisvaltinos Apr 29, 2025
25eb039
speed up execution by moving row_diff in concurrent method
themisvaltinos Apr 29, 2025
c217923
refactors; add progress bar; add info
themisvaltinos Apr 30, 2025
ec782e7
break up table diff console
themisvaltinos May 1, 2025
105390d
refactors; ux improvements
themisvaltinos May 2, 2025
ad92bfe
refactor on method in modelmeta
themisvaltinos May 2, 2025
b4d8fd7
refactors; improve error handling and messages; update docs
themisvaltinos May 5, 2025
507981b
clean up logic for non existing models per env; revise doc
themisvaltinos May 5, 2025
3c88f76
Apply suggestions from code review
themisvaltinos May 5, 2025
e474164
cleanups; add backticks for code strings
themisvaltinos May 5, 2025
e3040c8
revise docs
themisvaltinos May 6, 2025
752312e
fix comment
themisvaltinos May 6, 2025
0eb1ac4
remove no differences from the cli output
themisvaltinos May 6, 2025
1452eaa
Revert "remove no differences from the cli output"
themisvaltinos May 6, 2025
6de5f8e
revise logic so that info is provided for small number of models
themisvaltinos May 6, 2025
b1bbd4f
add unit test
themisvaltinos May 6, 2025
d9d44f4
fix incorrect message
themisvaltinos May 6, 2025
0a8d0f5
add similar message when no models contain differences
themisvaltinos May 6, 2025
5ad039f
remove without changes section
themisvaltinos May 6, 2025
9f2091c
fix to only show one message
themisvaltinos May 6, 2025
0177f02
cleanup unused constant
themisvaltinos May 6, 2025
77 changes: 76 additions & 1 deletion docs/guides/model_selection.md
@@ -2,7 +2,7 @@

This guide describes how to select specific models to include in a SQLMesh plan, which can be useful when modifying a subset of the models in a SQLMesh project.

-Note: the selector syntax described below is also used for the SQLMesh `plan` [`--allow-destructive-model` selector](../concepts/plans.md#destructive-changes).
+Note: the selector syntax described below is also used for the SQLMesh `plan` [`--allow-destructive-model` selector](../concepts/plans.md#destructive-changes) and for the `table_diff` command to [diff a selection of models](./tablediff.md#diffing-multiple-models-across-environments).

## Background

@@ -221,6 +221,81 @@ Models:
└── sushi.customer_revenue_lifetime
```

#### Select with tags

If we specify the `--select-model` option with a tag selector like `"tag:reporting"`, all models with the "reporting" tag will be selected. Tags are case-insensitive and support wildcards:

```bash
❯ sqlmesh plan dev --select-model "tag:reporting*"
New environment `dev` will be created from `prod`

Differences from the `prod` environment:

Models:
├── Directly Modified:
│   ├── sushi.daily_revenue
│   └── sushi.monthly_revenue
└── Indirectly Modified:
    └── sushi.revenue_dashboard
```
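
To make the matching rule concrete, here is a minimal sketch of case-insensitive wildcard matching on tags. It is not SQLMesh's implementation: the `MODEL_TAGS` mapping, the tag values, and the `select_by_tag` helper are hypothetical and exist only for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical model -> tags mapping, for illustration only.
MODEL_TAGS = {
    "sushi.daily_revenue": {"Reporting_Daily"},
    "sushi.monthly_revenue": {"reporting_monthly"},
    "sushi.items": {"staging"},
}


def select_by_tag(pattern: str, model_tags: dict[str, set[str]]) -> set[str]:
    """Return models with at least one tag matching the wildcard pattern, ignoring case."""
    wanted = pattern.lower()
    return {
        model
        for model, tags in model_tags.items()
        # Lower-case both sides so the comparison is case-insensitive.
        if any(fnmatchcase(tag.lower(), wanted) for tag in tags)
    }


selected = select_by_tag("REPORTING*", MODEL_TAGS)
```

Lower-casing both the pattern and the tag before calling `fnmatchcase` keeps the behaviour identical across platforms, which plain `fnmatch` does not guarantee.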

#### Select with git changes

The git-based selector allows you to select models whose files have changed compared to a target branch (default: main). This includes:
- Untracked files (new files not in git)
- Uncommitted changes in working directory
- Committed changes different from the target branch

For example:

```bash
❯ sqlmesh plan dev --select-model "git:feature"
New environment `dev` will be created from `prod`

Differences from the `prod` environment:

Models:
├── Directly Modified:
│   └── sushi.items # Changed in feature branch
└── Indirectly Modified:
    ├── sushi.order_items
    └── sushi.daily_revenue
```

You can also combine git selection with upstream/downstream indicators:

```bash
❯ sqlmesh plan dev --select-model "git:feature+"
# Selects changed models and their downstream dependencies

❯ sqlmesh plan dev --select-model "+git:feature"
# Selects changed models and their upstream dependencies
```
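
The `+` prefix and suffix can be thought of as a graph traversal starting from the selected models. The sketch below illustrates that idea on a hypothetical dependency graph; the `UPSTREAM` mapping and the helper names are assumptions made for this example, not SQLMesh internals.

```python
# Hypothetical dependency graph: model -> the models it reads from (direct upstream).
UPSTREAM = {
    "sushi.items": set(),
    "sushi.order_items": {"sushi.items"},
    "sushi.daily_revenue": {"sushi.order_items"},
}


def _closure(start, neighbours):
    """Collect every model reachable from `start` by repeatedly following `neighbours`."""
    seen = set(start)
    stack = list(start)
    while stack:
        node = stack.pop()
        for nxt in neighbours(node):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def with_upstream(selected, upstream):
    """Mimics "+selector": the selection plus everything it depends on."""
    return _closure(selected, lambda m: upstream.get(m, set()))


def with_downstream(selected, upstream):
    """Mimics "selector+": the selection plus everything that depends on it."""
    return _closure(selected, lambda m: {c for c, deps in upstream.items() if m in deps})


changed = {"sushi.items"}  # stand-in for the models returned by "git:feature"
downstream = with_downstream(changed, UPSTREAM)
```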

#### Complex selections with logical operators

The model selector supports combining multiple conditions using logical operators:

- `&` (AND): Both conditions must be true
- `|` (OR): Either condition must be true
- `^` (NOT): Negates a condition

For example:

```bash
❯ sqlmesh plan dev --select-model "(tag:finance & ^tag:deprecated)"
# Selects models with finance tag that don't have deprecated tag

❯ sqlmesh plan dev --select-model "(+model_a | model_b+)"
# Selects model_a and its upstream deps OR model_b and its downstream deps

❯ sqlmesh plan dev --select-model "(tag:finance & git:main)"
# Selects changed models that also have the finance tag

❯ sqlmesh plan dev --select-model "^(tag:test) & metrics.*"
# Selects models in metrics schema that don't have the test tag
```
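
One way to reason about these operators is as set algebra over the project's models: `&` is intersection, `|` is union, and `^` is the complement relative to all models. The sketch below uses hypothetical model names and tags to show this; it is an illustration, not the selector parser itself.

```python
from fnmatch import fnmatchcase

# Hypothetical project contents, for illustration only.
ALL_MODELS = {"metrics.daily", "metrics.fixture", "finance.ledger", "finance.old_ledger"}
TAGGED = {
    "finance": {"finance.ledger", "finance.old_ledger"},
    "deprecated": {"finance.old_ledger"},
    "test": {"metrics.fixture"},
}


def tag(name: str) -> set[str]:
    """Models carrying the given tag."""
    return TAGGED.get(name, set())


def glob(pattern: str) -> set[str]:
    """Models whose name matches the wildcard pattern."""
    return {m for m in ALL_MODELS if fnmatchcase(m, pattern)}


def negate(selection: set[str]) -> set[str]:
    """The ^ operator: every model that is not in the selection."""
    return ALL_MODELS - selection


# "(tag:finance & ^tag:deprecated)"
finance_live = tag("finance") & negate(tag("deprecated"))

# "^(tag:test) & metrics.*"
metrics_non_test = negate(tag("test")) & glob("metrics.*")
```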

### Backfill examples

#### No backfill selection
49 changes: 49 additions & 0 deletions docs/guides/tablediff.md
@@ -122,6 +122,55 @@ Under the hood, SQLMesh stores temporary data in the database to perform the com
The default schema for these temporary tables is `sqlmesh_temp` but can be changed with the `--temp-schema` option.
The schema can be specified as a `CATALOG.SCHEMA` or `SCHEMA`.


## Diffing multiple models across environments

SQLMesh allows you to compare multiple models across environments at once using model selection expressions. This is useful when you want to validate changes across a set of related models or the entire project.

To diff multiple models, use the `--select-model` (or `-m` for short) option with the table diff command:

```bash
sqlmesh table_diff prod:dev --select-model "sqlmesh_example.*"
```

When diffing multiple models, SQLMesh will:

1. Show the models returned by the selector that exist in both environments and have differences
2. Compare these models and display the data diff of each model

> Note: Models will only be data diffed if there's a breaking change that impacts them.

The `--select-model` option supports a powerful selection syntax that lets you choose models using patterns, tags, dependencies and git status. For complete details, see the [model selection guide](./model_selection.md).

> Note: Surround your selection pattern with single or double quotes, e.g. `'*'` or `"sqlmesh_example.*"`.

Here are some common examples:

```bash
# Select all models in a schema
sqlmesh table_diff prod:dev -m "sqlmesh_example.*"

# Select a model and its dependencies
sqlmesh table_diff prod:dev -m "+model_name" # include upstream deps
sqlmesh table_diff prod:dev -m "model_name+" # include downstream deps

# Select models by tag
sqlmesh table_diff prod:dev -m "tag:finance"

# Select models with git changes
sqlmesh table_diff prod:dev -m "git:feature"

# Use logical operators for complex selections
sqlmesh table_diff prod:dev -m "(metrics.* & ^tag:deprecated)" # models in the metrics schema that aren't deprecated

# Combine multiple selectors
sqlmesh table_diff prod:dev -m "tag:finance" -m "metrics.*_daily"
```

When multiple selectors are provided, they are combined with OR logic, meaning a model matching any of the selectors will be included.
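
The OR behaviour for repeated `-m` options amounts to taking the union of each selector's matches. A minimal sketch, restricted to name wildcards and using hypothetical model names (the `select_any` helper is an assumption for illustration):

```python
from fnmatch import fnmatchcase


def select_any(patterns, models):
    """Keep a model if it matches at least one of the patterns (OR logic)."""
    return {m for m in models if any(fnmatchcase(m, p) for p in patterns)}


# Hypothetical model names, for illustration only.
models = ["metrics.revenue_daily", "metrics.revenue_hourly", "finance.ledger"]
selected = select_any(["finance.*", "metrics.*_daily"], models)
```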

> Note: Every model being compared must define a `grain` that is unique and not null, because the grain is used to join the tables from the two environments.
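
To see why the grain must be unique and not null, consider a simplified in-memory version of the comparison. SQLMesh performs the join in the database; the `diff_by_grain` helper and the sample rows below are hypothetical and only illustrate the idea of keying both sides by the grain.

```python
def diff_by_grain(source_rows, target_rows, grain):
    """Compare two lists of row dicts after keying each side by the grain columns."""

    def index(rows):
        keyed = {}
        for row in rows:
            key = tuple(row[col] for col in grain)
            # A NULL or duplicated key would make the join ambiguous.
            if any(part is None for part in key):
                raise ValueError(f"grain contains NULL: {key}")
            if key in keyed:
                raise ValueError(f"grain is not unique: {key}")
            keyed[key] = row
        return keyed

    src, tgt = index(source_rows), index(target_rows)
    return {
        "source_only": sorted(src.keys() - tgt.keys()),
        "target_only": sorted(tgt.keys() - src.keys()),
        "changed": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }


prod = [{"id": 1, "total": 10}, {"id": 2, "total": 20}]
dev = [{"id": 2, "total": 25}, {"id": 3, "total": 30}]
result = diff_by_grain(prod, dev, grain=["id"])
```

With a duplicated or null key, a row on one side could pair with zero or several rows on the other, so the per-row comparison would no longer be well defined.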

## Diffing tables or views

Compare specific tables or views with the SQLMesh CLI interface by using the command `sqlmesh table_diff [source table]:[target table]`.
3 changes: 2 additions & 1 deletion docs/reference/cli.md
@@ -529,7 +529,7 @@ Options:
```
Usage: sqlmesh table_diff [OPTIONS] SOURCE:TARGET [MODEL]

-Show the diff between two tables.
+Show the diff between two tables or multiple models across two environments.

Options:
-o, --on TEXT The column to join on. Can be specified multiple
@@ -548,6 +548,7 @@ Options:
--temp-schema TEXT Schema used for temporary tables. It can be
`CATALOG.SCHEMA` or `SCHEMA`. Default:
`sqlmesh_temp`
-m, --select-model TEXT Select specific models to table diff.
--help Show this message and exit.
```

4 changes: 3 additions & 1 deletion docs/reference/notebook.md
@@ -293,7 +293,7 @@ Create a schema file containing external model schemas.
%table_diff [--on [ON ...]] [--skip-columns [SKIP_COLUMNS ...]]
[--model MODEL] [--where WHERE] [--limit LIMIT]
[--show-sample] [--decimals DECIMALS] [--skip-grain-check]
-[--temp-schema SCHEMA]
+[--temp-schema SCHEMA] [--select-model [SELECT_MODEL ...]]
SOURCE:TARGET

Show the diff between two tables.
@@ -320,6 +320,8 @@ options:
--skip-grain-check Disable the check for a primary key (grain) that is
missing or is not unique.
--temp-schema SCHEMA The schema to use for temporary tables.
--select-model <[SELECT_MODEL ...]>
Select specific models to diff using a pattern.
```
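
The `[--select-model [SELECT_MODEL ...]]` usage string has the shape that `argparse` produces for an option accepting zero or more values. The standalone sketch below reproduces that shape with plain `argparse`; it assumes nothing about how the notebook magic is actually wired up, and the parser here is purely illustrative.

```python
import argparse

# Illustrative parser only; not the notebook magic's real definition.
parser = argparse.ArgumentParser(prog="table_diff")
parser.add_argument("source_to_target", metavar="SOURCE:TARGET")
# nargs="*" lets a single flag collect several selectors into a list.
parser.add_argument("--select-model", nargs="*", default=None)

args = parser.parse_args(["prod:dev", "--select-model", "tag:finance", "metrics.*_daily"])
source, target = args.source_to_target.split(":")
```

In this sketch the positional argument is given before the option, because an option with `nargs="*"` consumes every following value up to the next flag.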

#### model
12 changes: 10 additions & 2 deletions sqlmesh/cli/main.py
@@ -892,18 +892,26 @@ def create_external_models(obj: Context, **kwargs: t.Any) -> None:
     type=str,
     help="Schema used for temporary tables. It can be `CATALOG.SCHEMA` or `SCHEMA`. Default: `sqlmesh_temp`",
 )
+@click.option(
+    "--select-model",
+    "-m",
+    type=str,
+    multiple=True,
+    help="Specify one or more models to data diff. Use wildcards to diff multiple models. Ex: '*' (all models with applied plan diffs), 'demo.model+' (this and downstream models), 'git:feature_branch' (models with direct modifications in this branch only)",
+)
 @click.pass_obj
 @error_handler
 @cli_analytics
 def table_diff(
     obj: Context, source_to_target: str, model: t.Optional[str], **kwargs: t.Any
 ) -> None:
-    """Show the diff between two tables."""
+    """Show the diff between two tables or a selection of models when they are specified."""
     source, target = source_to_target.split(":")
+    select_models = {model} if model else kwargs.pop("select_model", None)
     obj.table_diff(
         source=source,
         target=target,
-        model_or_snapshot=model,
+        select_models=select_models,
         **kwargs,
     )
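
In this hunk the positional `MODEL` argument takes precedence over `--select-model`. The sketch below isolates that precedence in a hypothetical helper, `resolve_select_models`, which is not part of the PR. It differs from the hunk in one respect: it pops `select_model` unconditionally, on the assumption that the receiving function does not accept a `select_model` keyword and the key should therefore never remain in `kwargs`.

```python
import typing as t


def resolve_select_models(
    model: t.Optional[str], kwargs: t.Dict[str, t.Any]
) -> t.Optional[t.Collection[str]]:
    """Hypothetical helper: prefer the positional model, else the --select-model values."""
    # Pop unconditionally so the CLI-only key never leaks into **kwargs.
    selected = kwargs.pop("select_model", None)
    return {model} if model else selected


# Only --select-model given: its values are used and the key is removed.
opts = {"select_model": ("tag:finance",), "limit": 20}
from_option = resolve_select_models(None, opts)

# Positional MODEL given: it wins, and the empty option tuple is still removed.
opts_with_model = {"select_model": ()}
from_positional = resolve_select_models("sushi.items", opts_with_model)
```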
