BUG: Large dataset exception on arrow string operations

### Pandas version checks

- [X] I have checked that this issue has not already been reported.

- [X] I have confirmed this bug exists on the [latest version](https://pandas.pydata.org/docs/whatsnew/index.html) of pandas.

- [X] I have confirmed this bug exists on the [main branch](https://pandas.pydata.org/docs/dev/getting_started/install.html#installing-the-development-version-of-pandas) of pandas.


### Reproducible Example

```python
import pandas as pd
df = pd.read_parquet("mypath/myfile.parquet", engine="pyarrow")
df = df.convert_dtypes(dtype_backend="pyarrow")
len(df[df.full_path.str.endswith("90_WW") ])
```


### Issue Description

On large datasets (24 million rows) getting "pyarrow.lib.ArrowInvalid: offset overflow while concatenating arrays error" when performing string operations. with Python 3.11 and pandas 2.2.2. Seems to be related to this pervious issue: https://github.com/pandas-dev/pandas/issues/55606

```python
Traceback (most recent call last):
  File "C:\Users\corey\Analytics_Software\pycharm\PyCharm 2024.1.1\plugins\python\helpers\pydev\pydevconsole.py", line 364, in runcode
    coro = func()
           ^^^^^^
  File "<input>", line 1, in <module>
  File "C:\Users\corey\Miniconda3\envs\analytics311\Lib\site-packages\pandas\core\frame.py", line 4093, in __getitem__
    return self._getitem_bool_array(key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\corey\Miniconda3\envs\analytics311\Lib\site-packages\pandas\core\frame.py", line 4155, in _getitem_bool_array
    return self._take_with_is_copy(indexer, axis=0)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\corey\Miniconda3\envs\analytics311\Lib\site-packages\pandas\core\generic.py", line 4153, in _take_with_is_copy
    result = self.take(indices=indices, axis=axis)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\corey\Miniconda3\envs\analytics311\Lib\site-packages\pandas\core\generic.py", line 4133, in take
    new_data = self._mgr.take(
               ^^^^^^^^^^^^^^^
  File "C:\Users\corey\Miniconda3\envs\analytics311\Lib\site-packages\pandas\core\internals\managers.py", line 894, in take
    return self.reindex_indexer(
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\corey\Miniconda3\envs\analytics311\Lib\site-packages\pandas\core\internals\managers.py", line 687, in reindex_indexer
    new_blocks = [
                 ^
  File "C:\Users\corey\Miniconda3\envs\analytics311\Lib\site-packages\pandas\core\internals\managers.py", line 688, in <listcomp>
    blk.take_nd(
  File "C:\Users\corey\Miniconda3\envs\analytics311\Lib\site-packages\pandas\core\internals\blocks.py", line 1307, in take_nd
    new_values = algos.take_nd(
                 ^^^^^^^^^^^^^^
  File "C:\Users\corey\Miniconda3\envs\analytics311\Lib\site-packages\pandas\core\array_algos\take.py", line 114, in take_nd
    return arr.take(indexer, fill_value=fill_value, allow_fill=allow_fill)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\corey\Miniconda3\envs\analytics311\Lib\site-packages\pandas\core\arrays\arrow\array.py", line 1309, in take
    return type(self)(self._pa_array.take(indices))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyarrow\\table.pxi", line 1052, in pyarrow.lib.ChunkedArray.take
  File "C:\Users\corey\Miniconda3\envs\analytics311\Lib\site-packages\pyarrow\compute.py", line 487, in take
    return call_function('take', [data, indices], options, memory_pool)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyarrow\\_compute.pyx", line 590, in pyarrow._compute.call_function
  File "pyarrow\\_compute.pyx", line 385, in pyarrow._compute.Function.call
  File "pyarrow\\error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow\\error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: offset overflow while concatenating arrays
```

### Expected Behavior

string operations work with arrow backend and large datasets

### Installed Versions

INSTALLED VERSIONS
------------------
commit                : d9cdd2ee5a58015ef6f4d15c7226110c9aab8140
python                : 3.11.9.final.0
python-bits           : 64
OS                    : Windows
OS-release            : 10
Version               : 10.0.22621
machine               : AMD64
processor             : Intel64 Family 6 Model 165 Stepping 2, GenuineIntel
byteorder             : little
LC_ALL                : None
LANG                  : None
LOCALE                : English_United States.1252
pandas                : 2.2.2
numpy                 : 2.1.0
pytz                  : 2024.1
dateutil              : 2.9.0.post0
setuptools            : 72.1.0
pip                   : 24.2
Cython                : None
pytest                : None
hypothesis            : None
sphinx                : None
blosc                 : None
feather               : None
xlsxwriter            : None
lxml.etree            : None
html5lib              : None
pymysql               : None
psycopg2              : None
jinja2                : None
IPython               : None
pandas_datareader     : None
adbc-driver-postgresql: None
adbc-driver-sqlite    : None
bs4                   : None
bottleneck            : None
dataframe-api-compat  : None
fastparquet           : None
fsspec                : None
gcsfs                 : None
matplotlib            : None
numba                 : None
numexpr               : None
odfpy                 : None
openpyxl              : None
pandas_gbq            : None
pyarrow               : 17.0.0
pyreadstat            : None
python-calamine       : None
pyxlsb                : None
s3fs                  : None
scipy                 : None
sqlalchemy            : None
tables                : None
tabulate              : None
xarray                : None
xlrd                  : None
zstandard             : None
tzdata                : 2024.1
qtpy                  : None
pyqt5                 : None


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

BUG: Large dataset exception on arrow string operations #59752

Pandas version checks

Reproducible Example

Issue Description

Expected Behavior

Installed Versions

INSTALLED VERSIONS

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

BUG: Large dataset exception on arrow string operations #59752

Description

Pandas version checks

Reproducible Example

Issue Description

Expected Behavior

Installed Versions

INSTALLED VERSIONS

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions