ChronoFormatWarning Should Raise Error as it Will Result in Malformed Timestamps #10985

@jayceslesar

Description

Checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

from datetime import timezone

import polars as pl

data = {
    "timestamp": [
        "2023-08-29 19:15:32.000",
        "2023-08-29 19:15:32.000",
        "2023-08-29 19:15:32.010",
        "2023-08-29 19:15:32.010",
    ],
    "field": [
        "sensor1",
        "sensor1",
        "sensor1",
        "sensor1",
    ],
    "value": [3, 4, 5, 6],
}

df = pl.DataFrame(data)
print(df)
"""
shape: (4, 3)
┌─────────────────────────┬─────────┬───────┐
│ timestamp               ┆ field   ┆ value │
│ ---                     ┆ ---     ┆ ---   │
│ str                     ┆ str     ┆ i64   │
╞═════════════════════════╪═════════╪═══════╡
│ 2023-08-29 19:15:32.000 ┆ sensor1 ┆ 3     │
│ 2023-08-29 19:15:32.000 ┆ sensor1 ┆ 4     │
│ 2023-08-29 19:15:32.010 ┆ sensor1 ┆ 5     │
│ 2023-08-29 19:15:32.010 ┆ sensor1 ┆ 6     │
└─────────────────────────┴─────────┴───────┘
"""

df = pl.DataFrame(data).with_columns(
    pl.col("timestamp")
    .str.strptime(pl.Datetime, format="%Y-%m-%d %H:%M:%S.%f")
    .cast(pl.Datetime(time_unit="ns", time_zone=timezone.utc))
)
"""
ChronoFormatWarning: Detected the pattern `.%f` in the chrono format string.
This pattern should not be used to parse values after a decimal point. Use `%.f` instead.
See the full specification: https://docs.rs/chrono/latest/chrono/format/strftime
"""
print(df)
"""
shape: (4, 3)
┌───────────────────────────────────┬─────────┬───────┐
│ timestamp                         ┆ field   ┆ value │
│ ---                               ┆ ---     ┆ ---   │
│ datetime[ns, UTC]                 ┆ str     ┆ i64   │
╞═══════════════════════════════════╪═════════╪═══════╡
│ 2023-08-29 19:15:32 UTC           ┆ sensor1 ┆ 3     │
│ 2023-08-29 19:15:32 UTC           ┆ sensor1 ┆ 4     │
│ 2023-08-29 19:15:32.000000010 UT… ┆ sensor1 ┆ 5     │
│ 2023-08-29 19:15:32.000000010 UT… ┆ sensor1 ┆ 6     │
└───────────────────────────────────┴─────────┴───────┘
"""

df = pl.DataFrame(data).with_columns(
    pl.col("timestamp")
    .str.strptime(pl.Datetime, format="%Y-%m-%d %H:%M:%S%.f")
    .cast(pl.Datetime(time_unit="ns", time_zone=timezone.utc))
)
print(df)
"""
shape: (4, 3)
┌─────────────────────────────┬─────────┬───────┐
│ timestamp                   ┆ field   ┆ value │
│ ---                         ┆ ---     ┆ ---   │
│ datetime[ns, UTC]           ┆ str     ┆ i64   │
╞═════════════════════════════╪═════════╪═══════╡
│ 2023-08-29 19:15:32 UTC     ┆ sensor1 ┆ 3     │
│ 2023-08-29 19:15:32 UTC     ┆ sensor1 ┆ 4     │
│ 2023-08-29 19:15:32.010 UTC ┆ sensor1 ┆ 5     │
│ 2023-08-29 19:15:32.010 UTC ┆ sensor1 ┆ 6     │
└─────────────────────────────┴─────────┴───────┘
"""

Log output

No response

Issue description

The warning is only emitted, not raised, and every subsequent operation still succeeds. If the warning goes unnoticed, the resulting bad values are hard to trace back to this mismatch between how chrono and Python's datetime interpret `%f`.
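To make the mismatch concrete, here is a minimal sketch of how the digits after the decimal point end up as different values. Only the `datetime.strptime` call is real parsing. The two chrono lines are a simplified model written for illustration (an assumption based on the output shown in the reproducible example, not a call into chrono itself).

```python
from datetime import datetime

raw = "2023-08-29 19:15:32.010"

# Python's datetime treats the digits after the point as a decimal fraction:
# "010" is right-padded to six digits, giving 10_000 microseconds (10 ms).
py_parsed = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S.%f")
python_ns = py_parsed.microsecond * 1_000

fraction_digits = raw.split(".")[1]

# Simplified model of chrono's `%f` (assumption, not chrono itself):
# the digits are read as an integer count of nanoseconds, so "010" is 10 ns.
chrono_percent_f_ns = int(fraction_digits)

# Simplified model of chrono's `%.f` (assumption, not chrono itself):
# the digits are a decimal fraction of a second, padded to nanoseconds.
chrono_dot_f_ns = int(fraction_digits.ljust(9, "0"))

print(python_ns, chrono_percent_f_ns, chrono_dot_f_ns)
```

Under this model, `.%f` yields 10 ns where Python's datetime and `%.f` both yield 10 ms, which matches the `19:15:32.000000010` value in the second table above.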

Expected behavior

ChronoFormatWarning should raise an error, because ignoring it leads Polars to create a malformed timestamp (as shown in the example above). Downstream tools that consume a file produced by Polars in this case fail with obscure warnings: I was not able to read the resulting Parquet file containing the malformed timestamp in R or MATLAB.
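Until the warning is turned into an error, a caller could guard the format string before handing it to `str.strptime`. The sketch below is one possible workaround, not Polars behavior: `require_safe_fraction_format` is a hypothetical helper name, and the check simply mirrors the `.%f` pattern named in the warning text.

```python
def require_safe_fraction_format(fmt: str) -> str:
    """Hypothetical helper: fail fast on the `.%f` pattern.

    Returns the format unchanged when it is safe, so the call can be
    placed inline where the format string is passed.
    """
    if ".%f" in fmt:
        raise ValueError(
            f"format {fmt!r} contains '.%f'; use '%.f' to parse the "
            "digits after a decimal point"
        )
    return fmt


# The safe pattern passes through untouched.
safe_format = require_safe_fraction_format("%Y-%m-%d %H:%M:%S%.f")

# The pattern from this issue is rejected before any parsing happens.
try:
    require_safe_fraction_format("%Y-%m-%d %H:%M:%S.%f")
    rejected = False
except ValueError:
    rejected = True
```

It would be used as `.str.strptime(pl.Datetime, format=require_safe_fraction_format(fmt))`. Alternatively, Python's standard `warnings.simplefilter("error", category=...)` can promote a warning category to an exception, assuming the ChronoFormatWarning class is importable in the installed Polars version.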

Installed versions

--------Version info---------
Polars:              0.19.2
Index type:          UInt32
Platform:            macOS-13.5.1-arm64-arm-64bit
Python:              3.10.6 (main, Oct  3 2022, 19:34:55) [Clang 13.1.6 (clang-1316.0.21.2.5)]

----Optional dependencies----
adbc_driver_sqlite:  <not installed>
cloudpickle:         <not installed>
connectorx:          <not installed>
deltalake:           <not installed>
fsspec:              2023.3.0
matplotlib:          <not installed>
numpy:               1.24.3
pandas:              1.5.1
pyarrow:             10.0.1
pydantic:            1.10.7
sqlalchemy:          1.4.48
xlsx2csv:            <not installed>
xlsxwriter:          <not installed>

Labels

A-temporal (Area: date/time functionality), bug (Something isn't working), python (Related to Python Polars)
