to_hdf writes data that doesn't match read back #7605
Comments
Please show pd.show_versions() and df.info(). You have odd dtypes; that needs to be fixed in order to serialize properly.
Any unicode or actual objects? (What I mean is: are the object fields just straight strings?)
The strings are unicode (they're Latin American names). I suspect we're onto something here -- I've been reviewing fields that turn up blank, and they've got some weird chars. Where did they come from? We've cleaned that field upstream, but maybe it got uncleaned. I'll clean to simple old-school ASCII and report back.
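For anyone wanting to answer the "straight strings?" question quickly, here is a rough sketch of a check. The helper name `describe_object_columns` is made up for illustration and is not part of pandas; it assumes Python 3.7+ (for `str.isascii`), whereas this thread predates that and the unicode/str distinction looked different on Python 2.

```python
import pandas as pd


def describe_object_columns(df):
    """For each object-dtype column, list the Python types found
    and count the strings that contain non-ASCII characters.

    Hypothetical helper for illustration; not a pandas API.
    """
    report = {}
    for col in df.columns:
        if df[col].dtype != object:
            continue  # only object columns can hide mixed types
        values = df[col].dropna()
        # Names of the Python types actually stored in the column
        type_names = sorted({type(v).__name__ for v in values})
        # Strings with at least one character outside plain ASCII
        non_ascii = sum(
            1 for v in values if isinstance(v, str) and not v.isascii()
        )
        report[col] = {"types": type_names, "non_ascii": non_ascii}
    return report
```

A column that reports more than one type, or an unexpected non-ASCII count, is a reasonable first suspect to inspect before serializing.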
Might be related to #7244. There is a try/except around a fast path, which, if it fails, hits the slower path. If you want to debug, that would be great!
progress:
@jreback I'm closing this now, I think you've nailed it. Very much appreciated!
Great, feel free to comment on/update that other one when you have time!
here's the code:
and the traceback:
I've been trying to figure out why upstream fixes didn't seem to appear downstream. I finally came here: apparently to_hdf is writing a file that's different when it's read back. As I've been re-running this over the last hour or so, different fields have come up in the AssertionError.
Here are a few things that do not eliminate the error: with or without compression; format table or fixed. However, changing these arguments does change which field is identified by assert_frame_equal as unequal.
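Since the original code isn't shown here, the following is only a sketch of how a round-trip check along these lines could be written; it is not the reporter's script. The names `roundtrip_mismatches` and `hdf_mismatches` are hypothetical. It assumes a recent pandas where `pandas.testing` exists (older versions used `pandas.util.testing`), and the HDF5 helper needs PyTables installed.

```python
import os
import tempfile

import pandas as pd
from pandas.testing import assert_series_equal


def roundtrip_mismatches(df, write, read):
    """Return the columns of df that differ after write(df, path)
    followed by read(path). Hypothetical helper, not a pandas API."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "roundtrip.bin")
        write(df, path)
        back = read(path)
    mismatched = []
    for col in df.columns:
        if col not in back.columns:
            mismatched.append(col)
            continue
        try:
            # Compare one column at a time so every bad field is listed,
            # not just the first one that raises.
            assert_series_equal(df[col], back[col])
        except AssertionError:
            mismatched.append(col)
    return mismatched


def hdf_mismatches(df, key="df", **to_hdf_kwargs):
    """Round-trip through HDF5 (needs PyTables); extra keyword
    arguments such as format or complevel go to DataFrame.to_hdf."""
    return roundtrip_mismatches(
        df,
        write=lambda frame, path: frame.to_hdf(
            path, key=key, mode="w", **to_hdf_kwargs
        ),
        read=lambda path: pd.read_hdf(path, key=key),
    )
```

Calling `hdf_mismatches(df, format="table")` and `hdf_mismatches(df, format="fixed")` side by side would show whether the set of mismatched fields really changes with the arguments, and passing a different writer/reader pair (pickle, csv) to `roundtrip_mismatches` gives a baseline to compare against.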
I have no idea how to reproduce this without my entire dataset, which is unfortunately confidential. I'll fall back to csv for now, and I hope that I'm just doing something horribly dumb that we can fix.