PERF: ArrowExtensionArray.fillna when array does not contain any nulls #51635

Merged 11 commits on Mar 17, 2023
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.1.0.rst
@@ -102,6 +102,7 @@ Performance improvements
 ~~~~~~~~~~~~~~~~~~~~~~~~
 - Performance improvement in :meth:`DataFrame.first_valid_index` and :meth:`DataFrame.last_valid_index` for extension array dtypes (:issue:`51549`)
 - Performance improvement in :meth:`DataFrame.clip` and :meth:`Series.clip` (:issue:`51472`)
+- Performance improvement in :meth:`~arrays.ArrowExtensionArray.fillna` when array does not contain nulls (:issue:`51635`)
 - Performance improvement in :func:`read_parquet` on string columns when using ``use_nullable_dtypes=True`` (:issue:`47345`)
 -

3 changes: 3 additions & 0 deletions pandas/core/arrays/arrow/array.py
@@ -631,6 +631,9 @@ def fillna(
     ) -> ArrowExtensionArrayT:
         value, method = validate_fillna_kwargs(value, method)

+        if not self._hasna:
+            return self.copy()
+
         if limit is not None:
             return super().fillna(value=value, method=method, limit=limit)
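
The early return added in this hunk can be illustrated outside pandas. The following is a minimal sketch, not the pandas implementation: `ToyNullableArray` and its list-backed `_hasna` property are hypothetical stand-ins, with `None` playing the role of a null. It shows that when no nulls are present, `fillna` hands back a new object with equal contents before any fill logic runs.

```python
from __future__ import annotations

from typing import Any


class ToyNullableArray:
    """Hypothetical stand-in for an extension array; None marks a null."""

    def __init__(self, values: list[Any]) -> None:
        self._values = list(values)

    @property
    def _hasna(self) -> bool:
        # True if at least one element is null.
        return any(v is None for v in self._values)

    def copy(self) -> ToyNullableArray:
        return ToyNullableArray(self._values)

    def fillna(self, value: Any) -> ToyNullableArray:
        # Fast path: nothing to fill, so return a copy right away.
        if not self._hasna:
            return self.copy()
        # Slow path: replace each null with the fill value.
        return ToyNullableArray(
            [value if v is None else v for v in self._values]
        )


no_nulls = ToyNullableArray([1, 2, 3])
filled = no_nulls.fillna(0)  # takes the fast path
with_nulls = ToyNullableArray([1, None, 3]).fillna(0)  # takes the slow path
```

Returning a copy rather than `self` keeps the no-op case consistent with the null-containing case, where callers always receive a new object; the `result is not data` assertions in the tests of this PR check the same property.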

4 changes: 2 additions & 2 deletions pandas/tests/extension/test_arrow.py
@@ -686,8 +686,8 @@ def test_fillna_no_op_returns_copy(self, data):
         result = data.fillna(valid)
         assert result is not data
         self.assert_extension_array_equal(result, data)
-        with tm.assert_produces_warning(PerformanceWarning):
-            result = data.fillna(method="backfill")
+
+        result = data.fillna(method="backfill")
         assert result is not data
         self.assert_extension_array_equal(result, data)

5 changes: 1 addition & 4 deletions pandas/tests/extension/test_string.py
@@ -165,10 +165,7 @@ def test_fillna_no_op_returns_copy(self, data):
         assert result is not data
         self.assert_extension_array_equal(result, data)

-        with tm.maybe_produces_warning(
-            PerformanceWarning, data.dtype.storage == "pyarrow"
-        ):
-            result = data.fillna(method="backfill")
+        result = data.fillna(method="backfill")
         assert result is not data
         self.assert_extension_array_equal(result, data)
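
The test changes in this PR drop the `PerformanceWarning` expectation for `method="backfill"` on data without nulls. A plausible reading, inferred from the tests rather than shown in the diff, is that the early return is reached before whatever fallback path used to warn. Below is a minimal stdlib-only sketch of that kind of check; `toy_fillna` and `ToyPerformanceWarning` are hypothetical names and do not correspond to pandas internals.

```python
import warnings


class ToyPerformanceWarning(Warning):
    """Hypothetical warning category used only in this sketch."""


def toy_fillna(values, method="backfill"):
    # Fast path: no nulls, so return a copy without warning.
    if not any(v is None for v in values):
        return list(values)
    # Fallback path (assumed to be the one that warns): backfill nulls
    # with the next valid value to the right.
    warnings.warn("falling back to a slower path", ToyPerformanceWarning)
    result = list(values)
    next_valid = None
    for i in range(len(result) - 1, -1, -1):
        if result[i] is None:
            result[i] = next_valid
        else:
            next_valid = result[i]
    return result


data = [1, 2, 3]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = toy_fillna(data)

# No nulls: a copy comes back and nothing was recorded.
no_warning_emitted = len(caught) == 0

with warnings.catch_warnings(record=True) as caught_slow:
    warnings.simplefilter("always")
    backfilled = toy_fillna([None, 2, None])
```

In the toy version, the trailing null stays null because there is no later valid value to pull from; only the null-containing input triggers the warning.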
