Expression profiling (legacy mode) #936
Conversation
Pull Request Overview
This PR implements a new performance profiling feature that enables detailed tracking and logging of expression evaluation runtimes in ActivitySim.
- Integrated a performance timer via the EvalTiming class.
- Extended core functions to wrap expression evaluations with performance measurement.
- Added new configuration settings and documentation to support and explain expression profiling.
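For readers unfamiliar with the pattern, below is a minimal sketch of what an expression-timing helper like the one described above can look like. The `EvalTiming` name and the `time_expression` method come from this PR, but the context-manager interface, constructor arguments, and log format shown here are assumptions for illustration, not the PR's actual API.

```python
import time
from contextlib import contextmanager


class EvalTiming:
    """Sketch of an expression timer: collect (expression, seconds) pairs, then write a log."""

    def __init__(self, enabled=True):
        self.enabled = enabled
        self.records = []

    @contextmanager
    def time_expression(self, expression):
        # Measure wall-clock time around a single expression evaluation.
        start = time.perf_counter()
        try:
            yield
        finally:
            if self.enabled:
                self.records.append((expression, time.perf_counter() - start))

    def write_log(self, path):
        # One line per expression, slowest first.
        with open(path, "w", encoding="utf-8") as f:
            for expr, seconds in sorted(self.records, key=lambda r: -r[1]):
                f.write(f"{seconds:10.6f}  {expr}\n")


# usage
timer = EvalTiming(enabled=True)
with timer.time_expression("df.income / df.hhsize"):
    result = 42  # stand-in for evaluating the actual expression
timer.write_log("expr-profile-example.log")
```

In the PR, the real timer is created only when the `expression_profile` setting is enabled, so disabled runs skip the bookkeeping entirely.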
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.
Summary per file:
| File | Description |
|---|---|
| docs/users-guide/performance/index.md | Added a link to the new Expression Profiling section. |
| docs/users-guide/performance/expr-profiling.md | Created documentation explaining expression profiling. |
| activitysim/core/simulate.py | Integrated performance timing in expression evaluation. |
| activitysim/core/interaction_simulate.py | Enabled profiling timer for interaction simulations. |
| activitysim/core/expressions.py | Passed trace_label parameter to compute_columns. |
| activitysim/core/configuration/top.py | Added new profiling settings and descriptive docstrings. |
| activitysim/core/configuration/base.py | Added a performance_log configuration field. |
| activitysim/core/assign.py | Wrapped several expression evaluations with profiling. |
| activitysim/cli/run.py | Generated reporting on profiling summary data. |
| activitysim/abm/models/summarize.py | Integrated performance timing in summarizing expressions. |
Comments suppressed due to low confidence (1)
activitysim/core/assign.py:250
- The variable 'trace_label' is used to create the performance log filename but is not defined in the current scope. Consider updating the function signature to accept a 'trace_label' parameter or use an alternative identifier available in context.
if state.settings.expression_profile:
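A hedged sketch of the kind of change Copilot is suggesting here: thread a trace_label argument into the function so the identifier used to build the log filename is actually defined in scope. The signature below is illustrative only, not the actual assign.py code.

```python
# Illustrative only -- not the actual assign.py signature. The point of the
# suggestion is that the identifier used to build the log filename must be
# defined in the enclosing scope, e.g. by accepting it as a parameter.
def assign_variables(state, assignment_expressions, df, trace_label=None):
    if state.settings.expression_profile:
        # trace_label is now guaranteed to exist here; fall back to a generic
        # name if the caller did not supply one.
        log_name = f"expr-profile-{trace_label or 'assign_variables'}.log"
        ...  # set up the expression-profiling logger using log_name
```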
Example outputs from the SANDAG test model are linked here. Note: to see the results, you'll need to click the first links or download the HTML files; the content is rendered as embedded JavaScript, and the OneDrive preview probably won't show anything.
dhensle left a comment:
> Profiling expressions also adds some overhead to the model run, so users should be careful about using the profiler in production runs. It is recommended to turn off the profiler in production runs, and only use it for debugging and development.
Do we have a rough estimate for this to add to the docs? For example, just say that for the full SANDAG model, this increased the runtime by XX amount.
In order to test this, do I need to add any other settings beyond: … Thanks!

Ignore my question above, I figured out why I was not seeing any outputs.
@jpn--, I ran into the following error when testing this branch in mc logsum in school location: … Our model has been tested to run on asim v1.3.4 and v1.4.0. Any suggestions? Let me know if you need anything from me to diagnose this issue.

I ran into the same issue that @dhensle describes above.
@jpn--, thanks for the suggestion at our meeting today. It turns out my previous error was indeed due to the file path being too long. I moved my working folder to C:/model and it finished successfully.

A solution to the path issue, which could also make the expression profiling more user friendly, would be to move all the logs into one csv file. That way, we can filter, sort, and review which expressions in which models are taking the most time. One potential issue with exporting the timings and expressions to csv is that many expressions contain commas, which could cause messy formatting in the csv. We could replace the commas in the expressions with other characters. Here is an example output to illustrate what the csv file could look like:
Sounds good to me. Saves the user from having to generate multiple output directories when testing/comparing different expressions. |
Sounds good to me too. |
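On the comma concern raised a couple of comments above: if the consolidated file is written with a real CSV writer rather than by joining strings with commas, embedded commas in expressions are quoted automatically and nothing needs to be replaced. A minimal sketch with made-up rows and column names:

```python
import csv

# Illustrative rows: (component, expression, seconds). The second expression
# contains commas, which the csv module quotes automatically.
rows = [
    ("school_location", "df.income / 1000", 0.041),
    ("trip_mode_choice", "df.income.clip(0, 200000)", 0.012),
]

with open("expression_profile_summary.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)  # default QUOTE_MINIMAL quotes any field containing a comma
    writer.writerow(["component", "expression", "seconds"])
    writer.writerows(rows)
```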
Should now be working as described above, with a timestamped log file.

I can confirm the timestamped outputs work as described above for me. I have no issues with the code changes either. I think the only outstanding issue is the additional runtime question above.
I was finally able to get around to testing it on my end. It seemed to work just fine, though I did get the screenshotted error at the end. I'm not sure if that's related (and all of the outputs were written). I have attached one of the outputs in case anyone would like to review it to confirm that it looks correct.

@jpn-- I can also do a runtime test. My tests so far have been with our small set of inputs, which runs in about 12 minutes, single process. I did do a single-process run with a 10 percent sample using the full inputs. I can do this with expression profiling turned off for a comparison. We usually run 100 percent samples with MP, but I imagine that should be off for this test? I tested the timestamped output folder and it works as expected.
@jpn-- I just reran the test on the same server using the same environment but setting …

I also shared the results with @bhargavasana, and he agreed with me that it would be more useful if the process name were included in the folder name and not just the timestamp for when it was created. It took me a few tries to find the file I shared in my last comment.
Oh, we should not have all those timestamped directories... that's a bug I will fix. There should only be a single timestamp for the entire multiprocessor run, stamped at the beginning of that run by the primary process. All the subprocesses should write into subdirectories of that one timestamped directory.

As for the error: it's clearly an error in the tool that merges the logs into a pair of HTML analysis outputs, the component and subcomponent reports. These are two output files in addition to all the other outputs you expect from an ActivitySim run, and I am 99% sure your first attempt did not produce these two final summary files. It didn't crash for the second run because it didn't run the expression profiling summary step where the error lives.

And: users should never have expression profiling enabled in "application" or "production" mode. This is a developer's tool, and your results show us that it costs about 13% more runtime overall. Only developers who will actually look at the performance results and try to tune the expressions to run faster should be using it.
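A rough sketch of the single-timestamp behavior described above, assuming the primary process stamps the run once and each subprocess reuses that timestamp. The directory layout and function name are illustrative, and the mechanism ActivitySim uses to pass the timestamp to subprocesses may differ.

```python
import time
from pathlib import Path


def profile_dir(output_dir, process_name, run_timestamp=None):
    # The primary process stamps the run once (run_timestamp=None generates it);
    # subprocesses receive that same timestamp and write into their own
    # subdirectory of the one timestamped directory.
    timestamp = run_timestamp or time.strftime("%Y%m%d-%H%M%S")
    d = Path(output_dir) / f"expr-profile-{timestamp}" / process_name
    d.mkdir(parents=True, exist_ok=True)
    return d, timestamp


# primary process creates the run-level timestamp ...
primary_dir, run_ts = profile_dir("output", "main")
# ... and each subprocess reuses it, so everything lands under one directory
sub_dir, _ = profile_dir("output", "mp_households_0", run_timestamp=run_ts)
```

Including the process name in the path also addresses the earlier request that the folder name identify the process, not just the creation time.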
Ah, I had misread the error message. About a month ago we had an error in application mode when the pipeline files were being written; I didn't look very closely and assumed it was the same thing.

@jpn-- Did the naming of the files work as intended?
@JoeJimFlood Yes, your most recent message looks correct. |
@jpn-- I think the updated text from Joe's full-scale SANDAG testing needs to be added still.
@dhensle I made the update to the docs in this PR |
approved, thanks @jpn-- |
* expression profiling
* two reports
* make simple report sortable
* change default to grid
* configurable expression_profile_cutoff
* works with MP
* documentation
* expression_profile_style
* performance_log can be bool
* fix for pandas 2
* test expr profiling
* test expression profiling on mtc and semcog examples
* set encoding to utf8 because windows
* add missing import
* add warning about path length in error message
* oops, .log not .csv
* add timestamping to expression profile logs
* fix file naming, update tests
* all subprocesses to same timestamp
* add note in docs about runtime cost
* Expression profiling (legacy mode) (#936) — the commits listed above
* manual rebuild of docs
* add permissions





Addresses #906. This pull request introduces a new performance profiling feature for expression evaluation in the ActivitySim framework. The feature allows developers to track and log the runtime of individual expressions, providing insights into potential bottlenecks in complex models. Key changes include the integration of a performance timer, updates to various core functions to support profiling, and new configuration settings for controlling profiling behavior.
Performance Profiling Feature

Integration of Performance Timer:
- Added the `EvalTiming` class to measure and log the execution time of expressions. The timer is initialized conditionally based on the new `expression_profile` setting. (activitysim/abm/models/summarize.py, activitysim/core/assign.py, activitysim/core/interaction_simulate.py, activitysim/core/simulate.py)

Expression Timing Wrappers:
- Wrapped expression evaluations with `performance_timer.time_expression` to measure the execution time of individual expressions. (activitysim/core/assign.py, activitysim/core/interaction_simulate.py, activitysim/core/simulate.py)

Log Writing:
- Wrote the collected timing results to log files. (activitysim/abm/models/summarize.py, activitysim/core/assign.py, activitysim/core/interaction_simulate.py, activitysim/core/simulate.py)

Configuration Updates

New Profiling Settings:
- Added the `expression_profile` and `expression_profile_cutoff` settings to enable/disable profiling globally or for specific components, and to filter out expressions based on runtime thresholds. (activitysim/core/configuration/top.py)

Documentation for Profiling:
- Added descriptive docstrings for the new settings, including the `performance_log` field. (activitysim/core/configuration/base.py, activitysim/core/configuration/top.py)

Core Function Enhancements

Trace Label Support:
- Passed the `trace_label` parameter through core functions for more granular logging and profiling. (activitysim/core/simulate.py)

Performance Reporting:
- Generated reporting on profiling summary data at the end of a run. (activitysim/cli/run.py)

This enhancement is primarily aimed at developers and advanced users who need to optimize model performance. While it introduces some overhead during execution, it provides valuable insights for debugging and improving complex expressions.
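To make the configuration concrete, here is a minimal sketch of enabling the profiler programmatically. Only the setting names (`expression_profile`, `expression_profile_cutoff`) come from this PR; how you obtain the `state` object, and the units assumed for the cutoff, are illustrative assumptions.

```python
def enable_expression_profiling(state, cutoff=0.05):
    # Developer/debugging tool only: the full-scale SANDAG test reported above
    # suggests roughly 13% extra runtime, so leave this off in production runs.
    state.settings.expression_profile = True
    # Assumed semantics: expressions faster than the cutoff are omitted from the logs.
    state.settings.expression_profile_cutoff = cutoff
```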