Skip to content

Dealing with if scalar #315

Closed
Closed
@MarcoGorelli

Description

@MarcoGorelli

The whole discussion around to_array is quite tricky, see #294 and #307 . One big difficulty is that for some libraries it can stay lazy (e.g. Dask has a lazy array), whereas for others it can't (polars LazyFrame doesn't have a to_numpy attribute)

Maybe we can temporarily park it, and try to address the more important (arguably) issue of what to do about

df: DataFrame
features = []
for column_name in df.column_names:
    if df.col(column_name).std() > 0:
        features.append(column_name)
return features

Because as far as I can tell, this call is problematic for all libraries other than purely eager ones. Even Dask, which was mentioned in #294 as an example of a library which can stay lazy in to_array, raises in the call above (see here).

Dask raises here, it doesn't do any implicit computation.

So...what do we do here? Maybe let's try resolving this one, and then return to to_array?

I'll hold off making suggestions this time, let's let the discussion roll

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions