Skip to content

Conversation

@jpn--
Copy link
Member

@jpn-- jpn-- commented May 12, 2025

The SkimDataset structure requires every variable to have a unique name. It also merges OMX variables based on time period, so that e.g. BIKETIME__AM and BIKETIME__PM, which would be 2-d arrays in the OMX file, become just two different parts of a 3-d array called BIKETIME in the SkimDataset. This is problematic when the skims also contain a 2-d array called BIKETIME, as that has no temporal dimension, and it gets loaded into a 2-d array in the SkimDataset, with the same name as the 3-d array, and thus one is overwritten and lost.

A workaround was added via the omx_ignore_patterns setting in #867, which allows the user to not load certain skims, but this relies on the model developer or user to set this up correctly.

This PR includes a skims input check to identify this overwriting condition, and raise an error if it is happening, so that the user can correct the condition via (1) the omx_ignore_patterns setting, (2) revising the skim generation process to not create the overlapping named skims in the file in the first place, or (3) renaming one or both skims if the users actually wants both skims variables in the model. The error message generated includes a link to instructions and discussion of these alternatives.

Enhancements to Skim Dataset Handling:

  • Added conflict detection for skim variables with both time-dependent and time-agnostic versions. If conflicts are found, the system now raises an error with detailed instructions for resolution. (activitysim/core/skim_dataset.py, activitysim/core/skim_dict_factory.py) [1] [2]
  • Introduced a skim_conflicts attribute in the skim dictionary factory to track and warn about duplicate skim names and conflicts. (activitysim/core/skim_dict_factory.py) [1] [2]
  • Enhanced checks for OMX file consistency, including verifying matrix shapes and ensuring unique skim names across files. (activitysim/core/skim_dict_factory.py) [1] [2]

Workflow and Dependency Updates:

  • Updated the GitHub Actions workflow to allow manual triggering via workflow_dispatch while maintaining existing conditions for automatic builds. (.github/workflows/branch-docs.yml)
  • Added re and defaultdict imports to support new functionality for skim conflict detection and regex-based ignore rules. (activitysim/core/skim_dataset.py, activitysim/core/skim_dict_factory.py) [1] [2]

Documentation Updates:

  • Added a new section on skims to the user guide, explaining their structure, usage, and naming conventions. Highlighted the deprecation of the three-zone system and provided recommendations for resolving skim conflicts. (docs/users-guide/model_anatomy.rst) [1] [2]

These changes aim to improve the robustness of skim data handling, provide clearer guidance to users, and ensure compatibility with future updates.

@jpn-- jpn-- requested a review from dhensle May 13, 2025 18:15
Copy link
Contributor

@dhensle dhensle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks great and the error message output is fantastic. However, this functionality is missing a unit test. I suggest 3 parts to the CI test using omx files with duplicate names: (1) data should load but emit warning if not sharrow, (2) run should crash if using sharrow, (3) run should continue if using sharrow but omx_ignore_patterns is set correctly.

@jpn--
Copy link
Member Author

jpn-- commented Jul 29, 2025

@dhensle I added CI testing exactly per your recommendation. Thanks!

@jpn-- jpn-- requested a review from dhensle July 29, 2025 16:32
Copy link
Contributor

@dhensle dhensle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks Jeff!

@jpn-- jpn-- merged commit 1027026 into ActivitySim:main Jul 29, 2025
16 of 17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants