Describe the bug
Most, but not all, fields initially encoded as strings are actually categorical. When they are, conversion to an explicit categorical type is efficient. However, when they are not categorical (e.g. escort tour participants), or are only loosely categorical with potentially many categories (vehicle type / age / fuel), the conversion to explicit categorical is not efficient.
In particular, converting non-categorical data to categorical ruins sharrow performance by triggering excessive recompilation, because every distinct categorical encoding is treated as a unique data type. For example, if a "categorical" escort tour participants column appears in a chooser table, recompilation will happen essentially every time the model runs.
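To illustrate why each encoding looks like a new type, here is a minimal pandas-only sketch. It does not touch sharrow internals; it only shows that two categorical columns built from different sets of values carry unequal dtypes, while the plain string columns share one dtype. The participant strings are made-up sample values.

```python
import pandas as pd

# Made-up escort participant strings from two hypothetical model runs.
run_1 = pd.Series(["1_2", "2_3", "1_2"])
run_2 = pd.Series(["1_3", "2_3", "4"])

# As plain string columns, both runs have the same dtype.
same_plain_dtype = run_1.dtype == run_2.dtype

# As categoricals, the dtype embeds the category set, so the dtypes differ.
cat_1 = run_1.astype("category")
cat_2 = run_2.astype("category")
same_categorical_dtype = cat_1.dtype == cat_2.dtype

print("plain dtypes equal:      ", same_plain_dtype)
print("categorical dtypes equal:", same_categorical_dtype)
print("run 1 categories:", list(cat_1.cat.categories))
print("run 2 categories:", list(cat_2.cat.categories))
```

Anything that keys compiled code on the column's dtype would see these two categorical columns as different inputs, which is consistent with the recompilation behavior described above.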
A fix will require not converting these fields to categorical data types.
This is quite possibly the problem in #756.
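One possible shape for the fix, sketched under assumptions: convert a string column to categorical only when it has few distinct values, and allow known non-categorical columns to be skipped by name. The function `categorize_low_cardinality`, its `max_categories` threshold, and the `skip` argument are hypothetical names for illustration and are not part of the ActivitySim or sharrow API.

```python
import pandas as pd


def categorize_low_cardinality(df, max_categories=50, skip=()):
    """Return a copy of df with low-cardinality string columns as categoricals.

    Hypothetical helper: the threshold and skip list are illustrative only.
    """
    out = df.copy()
    for col in out.columns:
        if col in skip:
            continue  # explicitly excluded, e.g. escort participants
        if isinstance(out[col].dtype, pd.CategoricalDtype):
            continue  # already categorical, leave untouched
        if not pd.api.types.is_string_dtype(out[col]):
            continue  # only consider string-encoded columns
        if out[col].nunique(dropna=True) <= max_categories:
            out[col] = out[col].astype("category")
    return out


# Small made-up table: 'mode' is truly categorical, 'participants' is not.
sample = pd.DataFrame(
    {
        "mode": ["walk", "drive", "walk", "drive"],
        "participants": ["1_2", "2_3", "1_3", "1_2_3"],
        "n": [1, 2, 3, 4],
    }
)
result = categorize_low_cardinality(sample, max_categories=3)
print(result.dtypes)
```

A cardinality threshold is only a heuristic. An explicit list of columns to leave as strings may be more predictable for fields such as escort tour participants.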