-
-
Notifications
You must be signed in to change notification settings - Fork 11k
BUG: cast to str_ should not convert to pure-python intermediate #9978
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
46838e9
to
703cd0b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I thought I already did a search for this kind of problem, but I guess @name@_getitem
slipped through the cracks.
Test fails due to some platforms not having |
703cd0b
to
4a1a2b4
Compare
Oh right.. then I will just remove the f16 test. |
# str(np.float64(...).item()) as this would incorrectly truncate. | ||
a = np.zeros(1, dtype='S20') | ||
a[:] = np.array(['1.12345678901234567890'], dtype='f8') | ||
assert_equal(a[0], b"1.1234567890123457") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add another test for S10
too?
It would be good to record what happens when the float str is too long. I'm not sure whether letting the string truncate is OK, or if it should round.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also might be tidier to use a 0d array here and .astype
, but up to you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all types (not just floats) there is a fairly well established precedent that strings get simply truncated if the dtype is too short, so I don't think we should round. Integers don't round.
Also, I'm a bit hesitant to add a test for S10
because I didn't touch any string truncation code here, and getting the truncation right seems like a separate issue, to be examined and discussed on its own. This PR's purpose is to test string casting intermediates, not to test the arguably unrelated string truncation.
Thanks Allan. |
Small fix for casting to string arrays. See See pandas-dev/pandas#18123.
When casting to a
string
(S
) array numpy currently doesdst[i] = str(src[i].item())
where recall that.item()
converts to a pure-python type. I think it should just dodst[i] = str(src[i])
.This comes up for floating types now that we updated their
str
, since nowstr(np.floatXX(...))
is not the same asstr(np.floatXX(...).item()
, especially in python2.