[Feature] Input Checker #678

@dhensle

RSG is developing an input checker as part of the phase 8 work.

Input checker features
The input checker is a series of checks run on ActivitySim inputs (synthetic population, land use, skims) to catch problems in the input data that might otherwise crash ActivitySim downstream or lead to bad model results. The input checker will run as the first "model" in an ActivitySim run and should finish quickly, before any subsequent ActivitySim sub-models start.

Original Design
The original design of the input checker aligns with the current paradigm of ActivitySim configuration files: a CSV file contains a list of Python expressions, each of which must evaluate to True for the check to pass. (For more details, see the presentations on Feb 16 and April 20.)
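To make the original design concrete, here is a minimal sketch of an expression-based checker. It is not the RSG implementation: the spec column names (`check_name`, `expression`), the `run_checks` helper, and the example tables and fields are all assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical spec file: one row per check. Column names are assumptions.
SPEC_CSV = """check_name,expression
positive_household_size,(households.hhsize > 0).all()
unique_household_id,households.household_id.is_unique
known_home_zone,households.home_zone_id.isin(land_use.zone_id).all()
"""


def run_checks(spec, tables):
    """Evaluate each spec expression against the input tables.

    Returns a dict mapping check name to True (passed) or False (failed).
    """
    results = {}
    for row in spec.itertuples(index=False):
        # eval is acceptable here only because the spec is a trusted,
        # user-authored configuration file.
        results[row.check_name] = bool(eval(row.expression, {"pd": pd}, tables))
    return results


# Illustrative inputs; the second household has an invalid size of 0.
households = pd.DataFrame(
    {"household_id": [1, 2, 3], "hhsize": [2, 0, 4], "home_zone_id": [10, 11, 12]}
)
land_use = pd.DataFrame({"zone_id": [10, 11, 12]})

spec = pd.read_csv(io.StringIO(SPEC_CSV))
results = run_checks(spec, {"households": households, "land_use": land_use})
failed = [name for name, passed in results.items() if not passed]
print(failed)
```

In this sketch only `positive_household_size` fails, because one household has `hhsize` equal to 0. A real checker would presumably also log which rows failed and decide whether a failure is a warning or a fatal error.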

A new proposal
The ActivitySim consortium has been discussing the possibility of adding a data model to the ActivitySim ecosystem. This data model (see #617) would leverage the pydantic and pandera packages to enumerate allowed values and document what each data field represents. Instead of building an input checker along the lines of the original design, a different approach would be to leverage the data model. The input checker code would then simply validate the input data against the definitions in the data model.

If we move towards the data model approach, RSG would implement the input checker and focus on the input side of the data model integration. This would include writing validator functions and checks in the data model rather than in the CSV "spec" file of the original approach. The two approaches would be fundamentally similar in function: the data model would still validate the input data via a series of checks defined by the user.

Discussion of pros & cons between the original design and the new approach took place at the May 4 and May 11 meetings. Please view the meeting notes and slides on those meeting pages for more details and in-depth discussion.

Deadline for decision
As decided in today's meeting, we are asking for further discussion and questions to be submitted before a decision is made at the May 18 meeting.
