BUG: Timestamp parsing creating invalid object #50668

Merged 5 commits on Jan 13, 2023
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v2.0.0.rst
@@ -909,6 +909,8 @@ Timezones
 - Bug in :meth:`Series.astype` and :meth:`DataFrame.astype` with object-dtype containing multiple timezone-aware ``datetime`` objects with heterogeneous timezones to a :class:`DatetimeTZDtype` incorrectly raising (:issue:`32581`)
 - Bug in :func:`to_datetime` was failing to parse date strings with timezone name when ``format`` was specified with ``%Z`` (:issue:`49748`)
 - Better error message when passing invalid values to ``ambiguous`` parameter in :meth:`Timestamp.tz_localize` (:issue:`49565`)
+- Bug in string parsing incorrectly allowing a :class:`Timestamp` to be constructed with an invalid timezone, which would raise when trying to print (:issue:`50668`)
+-

 Numeric
 ^^^^^^^
13 changes: 13 additions & 0 deletions pandas/_libs/tslibs/parsing.pyx
@@ -304,6 +304,19 @@ def parse_datetime_string(
         raise OutOfBoundsDatetime(
             f'Parsing "{date_string}" to datetime overflows'
         ) from err
+    if dt.tzinfo is not None:
+        # dateutil can return a datetime with a tzoffset outside of (-24H, 24H)
+        # bounds, which is invalid (can be constructed, but raises if we call
+        # str(dt)). Check that and raise here if necessary.
+        try:
+            dt.utcoffset()
+        except ValueError as err:
+            # offset must be a timedelta strictly between -timedelta(hours=24)
+            # and timedelta(hours=24)
+            raise ValueError(
+                f'Parsed string "{date_string}" gives an invalid tzoffset, '
+                "which must be between -timedelta(hours=24) and timedelta(hours=24)"
+            )

     return dt
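
The failure mode this hunk guards against can be reproduced with the standard library alone. The sketch below is illustrative, not the pandas code: `RawOffset` is a hypothetical stand-in for dateutil's `tzoffset` (a `tzinfo` that stores whatever offset it is given), and `check_tzoffset` is a hypothetical helper that mirrors the added check. Chaining with `from err` is a choice made for this sketch; the hunk above raises without it.

```python
from datetime import datetime, timedelta, tzinfo


class RawOffset(tzinfo):
    """Hypothetical tzinfo that stores any offset without validating it."""

    def __init__(self, seconds):
        self._offset = timedelta(seconds=seconds)

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return None


def check_tzoffset(dt, date_string):
    """Hypothetical helper: raise early if dt has an offset outside (-24h, 24h)."""
    if dt.tzinfo is not None:
        try:
            # datetime validates the offset only when it is requested.
            dt.utcoffset()
        except ValueError as err:
            raise ValueError(
                f'Parsed string "{date_string}" gives an invalid tzoffset, '
                "which must be between -timedelta(hours=24) and timedelta(hours=24)"
            ) from err
    return dt


# -111600 seconds is -31 hours, the offset quoted in the test comment.
bad = datetime(2022, 6, 20, 12, 0, tzinfo=RawOffset(-111600))  # constructs fine
good = datetime(2022, 6, 20, 12, 0, tzinfo=RawOffset(-3600))

# The invalid object only fails later, for example when it is printed.
try:
    str(bad)
    str_failed = False
except ValueError:
    str_failed = True
```

Calling `check_tzoffset(bad, "200622-12-31")` raises immediately with the new message, while `check_tzoffset(good, ...)` returns the datetime unchanged.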

10 changes: 10 additions & 0 deletions pandas/tests/scalar/timestamp/test_constructors.py
@@ -24,6 +24,16 @@


 class TestTimestampConstructors:
+    def test_construct_from_string_invalid_raises(self):
+        # dateutil (weirdly) parses "200622-12-31" as
+        # datetime(2022, 6, 20, 12, 0, tzinfo=tzoffset(None, -111600))
+        # which besides being mis-parsed, is a tzoffset that will cause
+        # str(ts) to raise ValueError. Ensure we raise in the constructor
+        # instead.
+        # see test_to_datetime_malformed_raise for analogous to_datetime test
+        with pytest.raises(ValueError, match="gives an invalid tzoffset"):
+            Timestamp("200622-12-31")
+
     def test_constructor_from_iso8601_str_with_offset_reso(self):
         # GH#49737
         ts = Timestamp("2016-01-01 04:05:06-01:00")
10 changes: 6 additions & 4 deletions pandas/tests/tools/test_to_datetime.py
@@ -1561,12 +1561,14 @@ def test_to_datetime_malformed_no_raise(self, errors, expected):
     def test_to_datetime_malformed_raise(self):
         # GH 48633
         ts_strings = ["200622-12-31", "111111-24-11"]
+        msg = (
+            'Parsed string "200622-12-31" gives an invalid tzoffset, which must '
+            r"be between -timedelta\(hours=24\) and timedelta\(hours=24\), "
+            "at position 0"
+        )
         with pytest.raises(
             ValueError,
-            match=(
-                r"^offset must be a timedelta strictly between "
-                r"-timedelta\(hours=24\) and timedelta\(hours=24\)., at position 0$"
-            ),
+            match=msg,
         ):
             with tm.assert_produces_warning(
                 UserWarning, match="Could not infer format"
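
The updated test escapes the parentheses in `msg` because `pytest.raises(match=...)` treats its argument as a regular expression and applies it with `re.search`. A minimal standard-library sketch of why that matters follows; the variable names are illustrative, and the `re.escape` variant is an alternative shown for comparison, not something this PR uses.

```python
import re

# The error text as the test expects it, including the position suffix.
message = (
    'Parsed string "200622-12-31" gives an invalid tzoffset, which must '
    "be between -timedelta(hours=24) and timedelta(hours=24), at position 0"
)

# Hand-escaped pattern, written the same way as in the test.
pattern = (
    'Parsed string "200622-12-31" gives an invalid tzoffset, which must '
    r"be between -timedelta\(hours=24\) and timedelta\(hours=24\), "
    "at position 0"
)

# Escaped parentheses match the literal "(" and ")" in the message.
matches_escaped = re.search(pattern, message) is not None

# Unescaped, "(hours=24)" is a capture group, so the literal parentheses
# in the message are never consumed and the search fails.
matches_unescaped = re.search(message, message) is not None

# re.escape builds an equivalent literal pattern automatically.
matches_re_escape = re.search(re.escape(message), message) is not None
```

Here `matches_escaped` and `matches_re_escape` are true and `matches_unescaped` is false, which is the reason the expected message cannot be passed to `match` verbatim.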