Merged
68 commits
44e3c21
multiprocess initial commit
dhensle Aug 17, 2024
9b29350
blacken
dhensle Aug 17, 2024
3434c95
parquet format for EDBs
dhensle Sep 6, 2024
914b9ca
adding pkl, fixing edb concat and write
dhensle Sep 20, 2024
d2e181f
fixing double naming of coefficient files
dhensle Sep 23, 2024
c138f0f
blacken
dhensle Sep 23, 2024
6d35f9f
fixing missing cdap coefficients file, write pickle function
dhensle Sep 23, 2024
27c4ce4
combact edb writing, index duplication, parquet datatypes
dhensle Sep 24, 2024
cd3d07e
sorting dest choice bundles
dhensle Sep 25, 2024
8a1fa3c
adding coalesce edbs as its own step
dhensle Sep 25, 2024
e8c03e6
CI testing initial commit
dhensle Sep 27, 2024
fe625e2
Merge pull request #1 from dhensle/estimation_enhancements
dhensle Sep 30, 2024
8d80e2e
infer.py CI testing
dhensle Oct 7, 2024
1459e48
estimation sampling for non-mandatory and joint tours
dhensle Oct 8, 2024
3fd7851
adding survey choice to choices_df in interaction_sample
dhensle Oct 12, 2024
23ba662
adding option to delete the mp edb subdirs
dhensle Oct 15, 2024
0a1bd5c
changes supporting sandag abm3 estimation mode
dhensle Oct 21, 2024
8a4b281
running test sandag example through trip dest sample
dhensle Nov 7, 2024
6a50abb
Estimation Pydantic (#2)
jpn-- Nov 7, 2024
45ee4e8
Estimation settings pydantic update
dhensle Nov 8, 2024
4af3fa9
new compact formatting
dhensle Nov 12, 2024
36dfb45
handling multiple columns for parquet write
dhensle Nov 12, 2024
e4eb045
dropping duplicate columns
dhensle Nov 22, 2024
b2972cc
actually removing duplicate columns
dhensle Nov 22, 2024
8d4dd37
dfs with correct indexes and correct mp sorting
dhensle Nov 23, 2024
1fb41a8
ignore index on sort for mp coalesce edbs
dhensle Nov 23, 2024
87b414f
updating estimation checks to allow for non-zero household_sample_size
dhensle Dec 5, 2024
3b4974c
Re-estimation (#3)
jpn-- Dec 9, 2024
aa874f6
Removing estimation.yaml settings that are no longer needed
dhensle Dec 14, 2024
a5e137b
Merge remote-tracking branch 'upstream/main' into estimation_enhancem…
dhensle Dec 14, 2024
af7e67e
fixing unit tests, setting parquet edb default
dhensle Dec 14, 2024
99822ca
one more missed estimation.yaml
dhensle Dec 14, 2024
1777637
using df.items for pandas 2 compatibility
dhensle Dec 14, 2024
420ed8e
tidy doc
jpn-- Dec 17, 2024
44bf037
updating edb file name for NMTF
dhensle Dec 21, 2024
8bccf2f
updating numba and pandas in the conda env files
dhensle Dec 26, 2024
7e59bd3
Improve test stability (#4)
jpn-- May 15, 2025
04c43a3
Merge remote-tracking branch 'upstream/main' into estimation_enhancem…
dhensle May 15, 2025
ed3ee7f
handling missing data or availability conditions
dhensle May 15, 2025
4ed400e
add docs on locking size terms
jpn-- May 20, 2025
64cd4b7
Merge branch 'main' into estimation_enhancements
jpn-- May 20, 2025
8f3439c
include constants in CDAP
jpn-- May 27, 2025
c2f8570
bump larch requirement
jpn-- May 28, 2025
cdc9752
require larch 6.0.40
jpn-- May 29, 2025
6f1b66a
add xlsxwriter to envs
jpn-- May 29, 2025
f08dc51
require larch 6.0.41
jpn-- May 31, 2025
495db41
Merge branch 'main' into estimation_enhancements
jpn-- May 31, 2025
aa8091a
add links
jpn-- Jun 5, 2025
c082361
fix typos and formatting
jpn-- Jun 5, 2025
c1fed67
cdap hh and per parquet read match csv
dhensle Jun 11, 2025
689e3f6
add missing x_validator for mode choice and nonmand tour freq
jpn-- Jun 26, 2025
cf7f7ee
add tour mode choice edit example
jpn-- Jun 26, 2025
5d19936
add to docs
jpn-- Jun 26, 2025
42c007e
union not addition on sets
jpn-- Jun 26, 2025
c2742a4
restore nb kernel
jpn-- Jun 26, 2025
d6c189d
Merge branch 'main' into estimation_enhancements
jpn-- Jul 17, 2025
3658d4f
Merge branch 'main' into estimation_enhancements
jpn-- Jul 21, 2025
9433a50
blacken
dhensle Jul 22, 2025
d279c83
replacing conda with uv in estimation tests
dhensle Jul 24, 2025
19d2bb1
add requests to github-action dependencies
dhensle Jul 24, 2025
f50122a
running with created virtual env instead
dhensle Jul 25, 2025
aa5f200
Fix estimation notebook tests (#8)
jpn-- Aug 12, 2025
5ada48d
Merge branch 'main' into estimation_enhancements
jpn-- Aug 13, 2025
618341a
Merge branch 'main' into estimation_enhancements
jpn-- Aug 13, 2025
c7a5474
Merge branch 'main' into estimation_enhancements
jpn-- Aug 25, 2025
744fc3c
Update scheduling.py
dhensle Sep 23, 2025
2a0c5b8
Merge branch 'main' into estimation_enhancements
jpn-- Sep 25, 2025
d03abac
Merge branch 'main' into estimation_enhancements
jpn-- Sep 26, 2025
79 changes: 78 additions & 1 deletion .github/workflows/core_tests.yml
@@ -321,6 +321,83 @@ jobs:
         run: |
           uv run pytest activitysim/estimation/test/test_larch_estimation.py --durations=0

+  estimation_notebooks:
+    needs: foundation
+    env:
+      python-version: "3.10"
+      label: win-64
+    defaults:
+      run:
+        shell: pwsh
+    name: Estimation Notebooks Test
+    runs-on: windows-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: "Set up Python"
+        uses: actions/setup-python@v5
+        with:
+          python-version-file: ".python-version"
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
+        with:
+          version: "0.7.12"
+          enable-cache: true
+          cache-dependency-glob: "uv.lock"
+
+      - name: setup graphviz
+        uses: ts-graphviz/setup-graphviz@v2
+
+      - name: Install activitysim
+        run: |
+          uv sync --locked --group github-action
+
+      - name: Create Estimation Data
+        run: >
+          uv run --group github-action python activitysim/examples/example_estimation/notebooks/est_mode_setup.py
+          --household_sample_size 5000
+
+      - name: Test Estimation Notebooks
+        run: >
+          uv run --group github-action pytest activitysim/examples/example_estimation/notebooks
+          --nbmake-timeout=3000
+          --ignore=activitysim/examples/example_estimation/notebooks/01_estimation_mode.ipynb
+          --ignore-glob=activitysim/examples/example_estimation/notebooks/test-estimation-data/**
+
+  estimation_edb_creation:
+    needs: foundation
+    env:
+      python-version: "3.10"
+      label: win-64
+    defaults:
+      run:
+        shell: pwsh
+    name: estimation_edb_creation_test
+    runs-on: windows-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
+        with:
+          version: "0.7.12"
+          enable-cache: true
+          cache-dependency-glob: "uv.lock"
+
+      - name: "Set up Python"
+        uses: actions/setup-python@v5
+        with:
+          python-version-file: ".python-version"
+
+      - name: Install activitysim
+        run: |
+          uv sync --locked --only-group github-action
+
+      - name: Test Estimation EDB Creation
+        run: |
+          uv run pytest activitysim/estimation/test/test_edb_creation/test_edb_formation.py --durations=0
+
   expression-profiling:
     needs: foundation
     env:
@@ -397,4 +474,4 @@ jobs:
           github_token: ${{ secrets.GITHUB_TOKEN }}
           # Token is created automatically by Github Actions, no other config needed
           publish_dir: ./docs/_build/html
-          destination_dir: develop
+          destination_dir: develop
4 changes: 2 additions & 2 deletions activitysim/abm/models/cdap.py
@@ -195,7 +195,7 @@ def cdap_simulate(
 estimator.write_coefficients(coefficients_df, model_settings)
 estimator.write_table(
     cdap_interaction_coefficients,
-    "interaction_coefficients",
+    "cdap_interaction_coefficients",
     index=False,
     append=False,
 )
@@ -204,7 +204,7 @@ def cdap_simulate(
 spec = cdap.get_cached_spec(state, hhsize)
 estimator.write_table(spec, "spec_%s" % hhsize, append=False)
 if add_joint_tour_utility:
-    joint_spec = cdap.get_cached_joint_spec(hhsize)
+    joint_spec = cdap.get_cached_joint_spec(state, hhsize)
     estimator.write_table(
         joint_spec, "joint_spec_%s" % hhsize, append=False
     )
11 changes: 6 additions & 5 deletions activitysim/abm/models/disaggregate_accessibility.py
@@ -764,11 +764,12 @@ def get_disaggregate_logsums(
     state.filesystem, model_name + ".yaml"
 )
 model_settings.SAMPLE_SIZE = disagg_model_settings.DESTINATION_SAMPLE_SIZE
-estimator = estimation.manager.begin_estimation(state, trace_label)
-if estimator:
-    location_choice.write_estimation_specs(
-        state, estimator, model_settings, model_name + ".yaml"
-    )
+# estimator = estimation.manager.begin_estimation(state, trace_label)
+# if estimator:
+# location_choice.write_estimation_specs(
+# state, estimator, model_settings, model_name + ".yaml"
+# )
+estimator = None

# Append table references in settings with "proto_"
# This avoids having to make duplicate copies of config files for disagg accessibilities
5 changes: 4 additions & 1 deletion activitysim/abm/models/joint_tour_frequency.py
@@ -192,16 +192,19 @@ def joint_tour_frequency(
 print(f"len(joint_tours) {len(joint_tours)}")

 different = False
+# need to check households as well because the full survey sample may not be used
+# (e.g. if we set household_sample_size in settings.yaml)
 survey_tours_not_in_tours = survey_tours[
     ~survey_tours.index.isin(joint_tours.index)
+    & survey_tours.household_id.isin(households.index)
 ]
 if len(survey_tours_not_in_tours) > 0:
     print(f"survey_tours_not_in_tours\n{survey_tours_not_in_tours}")
     different = True
 tours_not_in_survey_tours = joint_tours[
     ~joint_tours.index.isin(survey_tours.index)
 ]
-if len(survey_tours_not_in_tours) > 0:
+if len(tours_not_in_survey_tours) > 0:
     print(f"tours_not_in_survey_tours\n{tours_not_in_survey_tours}")
     different = True
 assert not different
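The consistency check in this hunk can be sketched on toy data as follows. This is a minimal illustration, not ActivitySim code: the helper name `compare_tours` and the three small frames are assumptions made up for the example. It shows why the extra `household_id` filter matters when only a subset of surveyed households is simulated.

```python
import pandas as pd


def compare_tours(survey_tours, joint_tours, households):
    """Hypothetical helper mirroring the two filters used in the hunk above."""
    # Survey tours missing from the model output, restricted to households
    # that were actually simulated (the survey may cover more households).
    survey_tours_not_in_tours = survey_tours[
        ~survey_tours.index.isin(joint_tours.index)
        & survey_tours.household_id.isin(households.index)
    ]
    # Model tours that have no counterpart in the survey.
    tours_not_in_survey_tours = joint_tours[
        ~joint_tours.index.isin(survey_tours.index)
    ]
    return survey_tours_not_in_tours, tours_not_in_survey_tours


# Toy data (assumed): household 3 appears in the survey but was not sampled.
households = pd.DataFrame(index=pd.Index([1, 2], name="household_id"))
survey_tours = pd.DataFrame(
    {"household_id": [1, 2, 3]}, index=pd.Index([10, 20, 30], name="tour_id")
)
joint_tours = pd.DataFrame(
    {"household_id": [1, 2]}, index=pd.Index([10, 20], name="tour_id")
)

missing, extra = compare_tours(survey_tours, joint_tours, households)
```

With the household filter in place, tour 30 is not reported as missing because its household was never simulated, so both returned frames are empty and the equivalent of `assert not different` would pass.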
37 changes: 2 additions & 35 deletions activitysim/abm/models/location_choice.py
@@ -19,7 +19,6 @@
 from activitysim.core.interaction_sample_simulate import interaction_sample_simulate
 from activitysim.core.util import reindex

-
 """
 The school/workplace location model predicts the zones in which various people will
 work or attend school.
@@ -140,7 +139,7 @@ def _location_sample(

 sample_size = model_settings.SAMPLE_SIZE

-if estimator:
+if estimator and model_settings.ESTIMATION_SAMPLE_SIZE >= 0:
     sample_size = model_settings.ESTIMATION_SAMPLE_SIZE
     logger.info(
         f"Estimation mode for {trace_label} using sample size of {sample_size}"
@@ -423,7 +422,7 @@ def location_presample(

 # choose a MAZ for each DEST_TAZ choice, choice probability based on MAZ size_term fraction of TAZ total
 maz_choices = tour_destination.choose_MAZ_for_TAZ(
-    state, taz_sample, MAZ_size_terms, trace_label
+    state, taz_sample, MAZ_size_terms, trace_label, model_settings
 )

 assert DEST_MAZ in maz_choices
@@ -512,38 +511,6 @@ def run_location_sample(
     trace_label=trace_label,
 )

-# adding observed choice to alt set when running in estimation mode
-if estimator:
-    # grabbing survey values
-    survey_persons = estimation.manager.get_survey_table("persons")
-    if "school_location" in trace_label:
-        survey_choices = survey_persons["school_zone_id"].reset_index()
-    elif ("workplace_location" in trace_label) and ("external" not in trace_label):
-        survey_choices = survey_persons["workplace_zone_id"].reset_index()
-    else:
-        return choices
-    survey_choices.columns = ["person_id", "alt_dest"]
-    survey_choices = survey_choices[
-        survey_choices["person_id"].isin(choices.index)
-        & (survey_choices.alt_dest > 0)
-    ]
-    # merging survey destination into table if not available
-    joined_data = survey_choices.merge(
-        choices, on=["person_id", "alt_dest"], how="left", indicator=True
-    )
-    missing_rows = joined_data[joined_data["_merge"] == "left_only"]
-    missing_rows["pick_count"] = 1
-    if len(missing_rows) > 0:
-        new_choices = missing_rows[
-            ["person_id", "alt_dest", "prob", "pick_count"]
-        ].set_index("person_id")
-        choices = choices.append(new_choices, ignore_index=False).sort_index()
-        # making probability the mean of all other sampled destinations by person
-        # FIXME is there a better way to do this? Does this even matter for estimation?
-        choices["prob"] = choices["prob"].fillna(
-            choices.groupby("person_id")["prob"].transform("mean")
-        )
-
 return choices


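The new guard in `_location_sample` can be read as a small selection rule. The sketch below restates it as a stand-alone function so the cases are easy to check. The name `resolve_sample_size` is hypothetical, and reading a negative `ESTIMATION_SAMPLE_SIZE` as "not set, fall back to `SAMPLE_SIZE`" is an assumption inferred from the `>= 0` comparison, not something the diff states.

```python
def resolve_sample_size(sample_size, estimation_sample_size, estimating):
    """Hypothetical stand-alone version of the guard in _location_sample."""
    # Use the estimation-specific size only in estimation mode and only when
    # it is non-negative; otherwise keep the regular sample size.
    if estimating and estimation_sample_size >= 0:
        return estimation_sample_size
    return sample_size


# Illustrative values only.
in_estimation = resolve_sample_size(30, 10, estimating=True)
negative_setting = resolve_sample_size(30, -1, estimating=True)
not_estimating = resolve_sample_size(30, 10, estimating=False)
```

Under this rule a value of 0 is still honored in estimation mode, since the comparison is `>= 0` rather than `> 0`.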
18 changes: 14 additions & 4 deletions activitysim/abm/models/non_mandatory_tour_frequency.py
Original file line number Diff line number Diff line change
@@ -289,14 +289,22 @@ def non_mandatory_tour_frequency(
 )

 if estimator:
-    estimator.write_spec(model_settings, bundle_directory=True)
+    bundle_directory = True
+    # writing to separate subdirectory for each segment if multiprocessing
+    if state.settings.multiprocess:
+        bundle_directory = False
+    estimator.write_spec(model_settings, bundle_directory=bundle_directory)
     estimator.write_model_settings(
-        model_settings, model_settings_file_name, bundle_directory=True
+        model_settings,
+        model_settings_file_name,
+        bundle_directory=bundle_directory,
     )
     # preserving coefficients file name makes bringing back updated coefficients more straightforward
     estimator.write_coefficients(coefficients_df, segment_settings)
     estimator.write_choosers(chooser_segment)
-    estimator.write_alternatives(alternatives, bundle_directory=True)
+    estimator.write_alternatives(
+        alternatives, bundle_directory=bundle_directory
+    )

     # FIXME #interaction_simulate_estimation_requires_chooser_id_in_df_column
     # should we do it here or have interaction_simulate do it?
@@ -435,8 +443,10 @@ def non_mandatory_tour_frequency(
 if estimator:
     # make sure they created the right tours
     survey_tours = estimation.manager.get_survey_table("tours").sort_index()
+    # need the household_id check below in case household_sample_size != 0
     non_mandatory_survey_tours = survey_tours[
-        survey_tours.tour_category == "non_mandatory"
+        (survey_tours.tour_category == "non_mandatory")
+        & survey_tours.household_id.isin(persons.household_id)
     ]
     # need to remove the pure-escort tours from the survey tours table for comparison below
     if state.is_table("school_escort_tours"):
5 changes: 4 additions & 1 deletion activitysim/abm/models/school_escorting.py
@@ -503,7 +503,10 @@ def school_escorting(
     coefficients_df, file_name=stage.upper() + "_COEFFICIENTS"
 )
 estimator.write_choosers(choosers)
-estimator.write_alternatives(alts, bundle_directory=True)
+if state.settings.multiprocess:
+    estimator.write_alternatives(alts, bundle_directory=False)
+else:
+    estimator.write_alternatives(alts, bundle_directory=True)

 # FIXME #interaction_simulate_estimation_requires_chooser_id_in_df_column
 # should we do it here or have interaction_simulate do it?
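The same multiprocess switch appears in `non_mandatory_tour_frequency.py`, `school_escorting.py`, and `stop_frequency.py`, each time as an explicit `if`/`else` or a reassigned flag. As a reviewer's sketch only, the rule reduces to a single boolean expression. The helper `bundle_directory_for` below is hypothetical and not part of ActivitySim, and `SimpleNamespace` stands in for the real settings object, which is assumed to expose a boolean `multiprocess` attribute as in the diff.

```python
from types import SimpleNamespace


def bundle_directory_for(settings) -> bool:
    """Hypothetical helper: bundle EDB files only when not multiprocessing."""
    # In the diff, multiprocess runs pass bundle_directory=False so each
    # segment is written to its own subdirectory; single-process runs keep
    # bundle_directory=True.
    return not settings.multiprocess


# Stand-ins for state.settings (assumed shape: a boolean `multiprocess`).
single = SimpleNamespace(multiprocess=False)
multi = SimpleNamespace(multiprocess=True)

single_flag = bundle_directory_for(single)
multi_flag = bundle_directory_for(multi)
```

A call site could then pass `bundle_directory=bundle_directory_for(state.settings)` instead of branching, though whether that is preferable to the explicit branches is a style choice for the authors.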
18 changes: 14 additions & 4 deletions activitysim/abm/models/stop_frequency.py
Original file line number Diff line number Diff line change
@@ -191,9 +191,15 @@ def stop_frequency(

 if estimator:
     estimator.write_spec(segment_settings, bundle_directory=False)
-    estimator.write_model_settings(
-        model_settings, model_settings_file_name, bundle_directory=True
-    )
+    # writing to separate subdirectory for each segment if multiprocessing
+    if state.settings.multiprocess:
+        estimator.write_model_settings(
+            model_settings, model_settings_file_name, bundle_directory=False
+        )
+    else:
+        estimator.write_model_settings(
+            model_settings, model_settings_file_name, bundle_directory=True
+        )
     estimator.write_coefficients(coefficients_df, segment_settings)
     estimator.write_choosers(chooser_segment)

@@ -265,7 +271,11 @@ def stop_frequency(

 survey_trips = estimation.manager.get_survey_table(table_name="trips")
 different = False
-survey_trips_not_in_trips = survey_trips[~survey_trips.index.isin(trips.index)]
+# need the check below on household_id in case household_sample_size != 0
+survey_trips_not_in_trips = survey_trips[
+    ~survey_trips.index.isin(trips.index)
+    & survey_trips.household_id.isin(trips.household_id)
+]
 if len(survey_trips_not_in_trips) > 0:
     print(f"survey_trips_not_in_trips\n{survey_trips_not_in_trips}")
     different = True