read_fwf - parsers.py PythonParser._rows_to_columns line 2814 object of type 'NoneType' has no len() #19436
Comments
@C5G6M7 can you write a small, reproducible example that currently fails, and submit a fix with it as a test case?
@TomAugspurger Yes, I can work on this tonight. I encountered it on quite a big dataset running inside another application that uses pandas, so I'll have to do a bit of debugging to see how to reproduce it with a simpler input. In general, though, in the above code, if self.delimiter is ever None when this line executes, it will raise an error. I made the quick patch proposed above to my pandas installation and the problem went away. I believe the patch is safe, because the code inside the conditional only adds an error message about a multi-char delimiter, which wouldn't apply anyway if the delimiter were None. However, there could be another issue earlier in the code if the parser is expected to guarantee one of two things: either the delimiter always has a default string value (such as a comma) so that len() always works, or self.quoting == csv.QUOTE_NONE whenever the delimiter has no len(). I'm not 100% sure, but it might also fix the issue to rearrange the conditionals so that "self.quoting != csv.QUOTE_NONE" is evaluated first; if that evaluates to false, "len(self.delimiter)" is never checked.
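The reordering idea can be checked in isolation. The sketch below uses hypothetical stand-in functions (not pandas code) that take `delimiter` and `quoting` as plain arguments. It suggests that putting the quoting test first only avoids `len(None)` when quoting is `csv.QUOTE_NONE`, so an explicit `None` check would probably still be needed under the default quoting.

```python
import csv


def original_order(delimiter, quoting):
    # Hypothetical stand-in for the condition described in this issue:
    # len() is evaluated first, so delimiter=None raises TypeError.
    return len(delimiter) > 1 and quoting != csv.QUOTE_NONE


def reordered(delimiter, quoting):
    # Quoting test first: `and` short-circuits, so len() is skipped
    # only when quoting == csv.QUOTE_NONE.
    return quoting != csv.QUOTE_NONE and len(delimiter) > 1


def raises_type_error(func, *args):
    # Small helper for this sketch: report whether the call raises TypeError.
    try:
        func(*args)
    except TypeError:
        return True
    return False


print(raises_type_error(original_order, None, csv.QUOTE_NONE))   # True
print(raises_type_error(reordered, None, csv.QUOTE_NONE))        # False
print(raises_type_error(reordered, None, csv.QUOTE_MINIMAL))     # True
```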
@TomAugspurger still working on reproducing this. I removed the edits I initially made to handle this and haven't encountered the error again yet. It can only occur with files that have bad lines, which means the column names must be passed explicitly so that the parser does not automatically create the extra columns. Unfortunately I can't remember which file caused this. I'm going to keep running this, and as soon as I encounter a file that produces the issue again I will update this.
Indeed, dupe of #13374
Line 2814 in parsers.py throws an error if self.delimiter is None:
"object of type 'NoneType' has no len()"
Here is the current line of code where the error happens:
I propose the following fix, which I believe should be a safe replacement:
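A minimal sketch of what such a guard could look like, not the author's exact patch: it assumes the check has the shape described in the comments (a `len(self.delimiter)` test combined with `self.quoting != csv.QUOTE_NONE`) and places a `None` test first so that `len()` is never called on a missing delimiter. `needs_multichar_hint` is a hypothetical helper, not a pandas function.

```python
import csv


def needs_multichar_hint(delimiter, quoting):
    """Return True when the multi-char-delimiter hint would apply."""
    # Checking for None first means len() is never called on a missing
    # delimiter; `and` stops at the first false operand.
    return (
        delimiter is not None
        and len(delimiter) > 1
        and quoting != csv.QUOTE_NONE
    )


print(needs_multichar_hint(None, csv.QUOTE_MINIMAL))  # False, no TypeError
print(needs_multichar_hint("::", csv.QUOTE_MINIMAL))  # True
print(needs_multichar_hint("::", csv.QUOTE_NONE))     # False
```

With a guard of this kind, the malformed-line message would simply omit the multi-char delimiter hint when no delimiter is set.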
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.4.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.20.3
pytest: 3.2.1
pip: 9.0.1
setuptools: 36.5.0.post20170921
Cython: 0.26.1
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.8
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 0.9.8
lxml: 3.8.0
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: 1.1.13
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None