Pandas attempts to convert some strings to timestamps when grouping by a timestamp and aggregating? #10078
Comments
Version info (from INSTALLED VERSIONS): commit: None, pandas: 0.15.1
Rather than rolling your own, you probably want this.
IIRC there is another outstanding issue about this. This inference is quite tricky and could certainly be a bit buggy. Want to step through things and see if you can narrow down what's going on?
Good suggestion! Still, note how that returns a MultiIndex? If you change it to x.mode().iloc[0] so that you have a plain Timestamp index, you get back the same weird result. I'll dig as time allows. If you can remember the other open issue, please post it. I was struggling to describe this, so it was tricky to search for open issues, SO posts, etc. Thanks!
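For anyone following along, here is a minimal sketch of the two shapes being discussed. The data values and the `timestamp`/`userAgent` column names are made up for illustration (the original log lines are not shown in this thread), and it assumes a reasonably recent pandas.

```python
import pandas as pd

# Hypothetical stand-in data: three identical rows, as in the report.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2015-05-07 12:00"] * 3),
    "userId": ["17661101"] * 3,
    "userAgent": ["Mozilla/5.0"] * 3,
})

grouped = df.groupby("timestamp")[["userId", "userAgent"]]

# mode() returns a DataFrame per group, so the combined result is indexed
# by (group key, row position within the mode frame): a MultiIndex.
all_modes = grouped.apply(lambda g: g.mode())

# Taking the first mode row returns a Series per group, so the combined
# result is indexed by the group key alone.
first_mode = grouped.apply(lambda g: g.mode().iloc[0])
```

Comparing `all_modes.index` with `first_mode.index` shows the difference; the second form is the one that triggered the Timestamp conversion on 0.15.1.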
Looks like this works correctly on master. I suppose this could use a test.
I am working through logs of web requests, and when I want to find the most common, say, user agent string for a (disguised) user, I run something like the following:
Note that in this (admittedly unusual) example, all of the lines are identical; I'm not sure whether that is necessary to reproduce the issue. I'm also obscuring the exact purpose of this code, but it reproduces the bug: the 'userId' comes back as a Timestamp, not a string. The conversion happens after the function most_common_values returns, since the userId is still a string, not a Timestamp, at the point where it is returned. If we change the value of the userId to an int:
or if the value of the associated integer is small enough:
then the results are what we'd expect (the most common value is returned as its original type).
I imagine that, for some reason, something like a dateutil parser is being called on strings by default, but that probably shouldn't be happening.
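A rough sketch of the kind of check described above, for anyone who wants to try it on their own version. The body of `most_common_values` here is a hypothetical reconstruction (only its name comes from this report), and the `aggregate` helper, column names, and values are invented for illustration.

```python
import pandas as pd


def most_common_values(frame):
    # Hypothetical reconstruction: most frequent value of each column.
    return pd.Series(
        {col: frame[col].value_counts().idxmax() for col in frame.columns}
    )


def aggregate(user_id):
    # Build identical rows, group by the timestamp, and aggregate.
    df = pd.DataFrame({
        "timestamp": pd.to_datetime(["2015-05-07 12:00"] * 4),
        "userId": [user_id] * 4,
        "userAgent": ["Mozilla/5.0"] * 4,
    })
    grouped = df.groupby("timestamp")[["userId", "userAgent"]]
    return grouped.apply(most_common_values)


string_result = aggregate("17661101")  # userId given as a string
int_result = aggregate(17661101)       # userId given as an int

# Inspect the type that actually comes back for each variant.
string_type = type(string_result["userId"].iloc[0])
int_type = type(int_result["userId"].iloc[0])
```

If the string variant comes back as anything other than a string type, the inference described in this issue is being applied.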