Document Using Regex for str.split #25296
Comments
This is not a bug: the plus sign has to be escaped when it is used in a regular expression. That said, this feature is not documented, so I think we can re-purpose this issue to actually document support for regex splitting.
The behavior is inconsistent though, as it seems:
# this works:
df.col.str.split(',|#|=|-', expand=True)
# this does not:
df.col.str.split(',|#|=|-|+', expand=True)
# and you have to escape the plus sign:
df.col.str.split(',|#|=|-|\+', expand=True)
It's consistent with regex behavior where |
I can work on putting this in the documentation. Would you be okay with localized documentation in all of the str methods where this is applicable?
@zangell44 I think it is documented in most methods, but sure, if you see others where it isn't, by all means include them in a PR.
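As a minimal sketch of the escaping point discussed in the comments above, the snippet below uses only Python's standard `re` module to show why an unescaped `+` fails and how `re.escape` can build a safe alternation pattern. The `delimiters` list and the sample string are illustrative assumptions, not taken from the original report.

```python
import re

# Hypothetical set of literal delimiters, mirroring the ones in the thread.
delimiters = [",", "#", "=", "-", "+"]

# Unescaped, "+" is a quantifier with nothing to repeat, so compiling fails.
try:
    re.compile("|".join(delimiters))
    unescaped_ok = True
except re.error:
    unescaped_ok = False

# re.escape backslash-escapes regex metacharacters in each literal delimiter.
pattern = "|".join(re.escape(d) for d in delimiters)

# Split an illustrative string on any of the delimiters.
parts = re.split(pattern, "a,b#c=d-e+f")
print(unescaped_ok, parts)
```

A pattern string built this way should be usable in the same position as the hand-escaped one in the thread, for example `df.col.str.split(pattern, expand=True)`.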
Problem description
When passing two patterns separated by | to the str.split() method, if one of them is +, pandas returns the following error:
INSTALLED VERSIONS
commit: None
python: 3.6.8.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 142 Stepping 10, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.23.4
pytest: 3.7.1
pip: 18.1
setuptools: 40.2.0
Cython: 0.29.2
numpy: 1.15.4
scipy: 1.2.0
pyarrow: None
xarray: 0.11.0
IPython: 7.1.1
sphinx: 1.7.6
patsy: 0.5.1
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: 1.2.1
tables: 3.4.3
numexpr: 2.6.9
feather: None
matplotlib: 3.0.2
openpyxl: 2.5.5
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.5
lxml: 4.2.4
bs4: 4.7.1
html5lib: 1.0.1
sqlalchemy: 1.2.10
pymysql: None
psycopg2: 2.7.6.1 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None