cube.aggregated_by and multidimensional auxcoords #3174

Merged

Conversation

rcomer
Member

@rcomer rcomer commented Sep 17, 2018

I had a go at fixing My First Issue #1530.

It's not quite there yet - see inline comments and failing new test.

for i in key_slice])
new_points.append(new_pt)
new_pts = np.apply_along_axis(
    '|'.join, dim, coord.points.take(key_slice, dim))
Member Author

This doesn't quite work (and is the reason for the new test failing), as the longer resulting strings get truncated to the length of the first joined string (here also the shortest). E.g.:

import numpy as np

a = np.arange(25).reshape(5, 5)
stra = a.astype(np.dtype('U'))

print(stra[:, 2:4])

print(np.apply_along_axis('|'.join, 1, stra.take([2, 3], 1)))

gives

[['2' '3']
 ['7' '8']
 ['12' '13']
 ['17' '18']
 ['22' '23']]
['2|3' '7|8' '12|' '17|' '22|']

Any suggestions on how to fix this? I'm sure I could come up with a loop, but something slicker would be nice.
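
A possible explanation, offered as a sketch rather than a definitive diagnosis: np.apply_along_axis appears to allocate its output using the dtype of the first joined result ('2|3', three characters wide), so later, longer results are cut to that width. One way around it is to build the joined strings in a plain Python list and let NumPy choose the string width afterwards. The helper name join_strings_along_axis below is hypothetical and is not necessarily what this PR ended up using.

```python
import numpy as np


def join_strings_along_axis(arr, axis, sep="|"):
    """Hypothetical helper: join string elements along `axis` without truncation."""
    # Move the axis to join over to the end, so each row of `flat` is one group.
    moved = np.moveaxis(arr, axis, -1)
    flat = moved.reshape(-1, moved.shape[-1])
    # Build a plain Python list first; NumPy then picks a string width
    # wide enough for the longest joined result.
    joined = [sep.join(row) for row in flat]
    return np.array(joined).reshape(moved.shape[:-1])


a = np.arange(25).reshape(5, 5)
stra = a.astype(np.dtype('U'))

result = join_strings_along_axis(stra.take([2, 3], 1), 1)
print(result)
```

With the same input as the example above, entries such as '12|13' should come back intact instead of being cut to '12|'.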

Member Author

I posted this on StackOverflow.
https://stackoverflow.com/questions/52677947

Based on the linked numpy GitHub issue, I have a second solution. So I think the options are either a helper function like join_along_axis in my original SO post, or something along the lines of the answer I posted. Any preferences?

Member Author

Timings:
With join_along_axis timeit gives me:
500 loops, best of 3: 77 usec per loop

With the second solution:
500 loops, best of 3: 225 usec per loop

The second solution with the dtype calculation outside the loop:
500 loops, best of 3: 139 usec per loop

So I am very much leaning to using join_along_axis, as I think it also wins on readability.
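
For anyone wanting to run a comparison of this kind, here is a minimal sketch using the standard-library timeit module. The function join_pairs is a hypothetical stand-in, not the join_along_axis helper from the SO post, so the numbers it produces will not match the figures quoted above.

```python
import timeit

import numpy as np


def join_pairs(arr, sep="|"):
    # Hypothetical stand-in for the helper being timed.
    return np.array([sep.join(row) for row in arr])


stra = np.arange(25).reshape(5, 5).astype(np.dtype('U'))
pairs = stra.take([2, 3], 1)

# Mirror the "500 loops, best of 3" shape of the figures quoted above.
times = timeit.repeat(lambda: join_pairs(pairs), repeat=3, number=500)
best_usec = min(times) / 500 * 1e6
print(f"500 loops, best of 3: {best_usec:.0f} usec per loop")
```

Taking the minimum of the repeats, rather than the mean, is the usual way to reduce noise from other processes on the machine.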

@rcomer
Member Author

rcomer commented Sep 17, 2018

Somewhat belatedly realised that no one could see my inline comments until I hit “submit review” 😳

@DPeterK
Member

DPeterK commented Sep 24, 2018

no one could see my inline comments until I hit “submit review”

Not to worry – we all do this all the time too! I'll see if I can take a look at this sometime this week...

@rcomer rcomer force-pushed the aggregate_by-fix-multidim-auxcoords branch from 9b59fd3 to b862cd7 Compare October 15, 2018 11:01
@rcomer
Member Author

rcomer commented Oct 15, 2018

Multidimensional string coord test now fixed. For my rationale on the method, see the outdated diff comments for a conversation I had with myself! 😆

@corinnebosley
Member

@rcomer Awesome! I'm afraid @dkillick is not in this week to continue his review, but @kaedonkers will be taking a look instead this very afternoon.

@rcomer
Member Author

rcomer commented Oct 15, 2018

Thanks @corinnebosley - I am on leave for two weeks from Wednesday so, unless the review changes are very minor, they won't get done till November anyway.

Member

@corinnebosley corinnebosley left a comment

Thanks for all this effort @rcomer, I don't know how you find the time. I've added a few comments to your PR, mostly just aesthetic changes though. Functionally, this looks sound to me.

@rcomer
Member Author

rcomer commented Nov 2, 2018

Thanks @corinnebosley and @QuLogic for your feedback. I think I've covered everything except I've not changed the "simple" names in the tests (though happy to do so given a better suggestion).

@rcomer
Member Author

rcomer commented Nov 2, 2018

I've had a look through the test failures, and they don't seem to be related to aggregated_by. I'm quite far behind the master branch now - is a rebase likely to help?

Edit: have rebased as it can't do any harm!

@rcomer rcomer force-pushed the aggregate_by-fix-multidim-auxcoords branch from f952caa to 4540965 Compare November 2, 2018 14:27
@rcomer
Member Author

rcomer commented Nov 2, 2018

Rebase didn't help. 😞

@rcomer
Member Author

rcomer commented Nov 3, 2018

Have updated the whatsnew contributions directory name, as this is clearly not going into v2.2!

Looking more closely at the Travis failures, although they are popping up all over the place, they all seem to come down to datetime/calendar issues.

@corinnebosley
Member

@rcomer Yeah, apologies about that, the failures are to do with some dependencies in the testing environment. I'll respin the tests once we've fixed the dependency issues.

@rcomer rcomer force-pushed the aggregate_by-fix-multidim-auxcoords branch from 3f68631 to 51618ce Compare November 7, 2018 11:47
@rcomer
Member Author

rcomer commented Nov 7, 2018

Hi @corinnebosley, I just noticed your latest PR so have rebased again. 🤞

@corinnebosley
Member

@rcomer Wow, you don't hang about, do you? I'll ping you with some details later on.

@corinnebosley
Member

Nice work @rcomer! Thanks for the effort.

@corinnebosley corinnebosley merged commit 42baa80 into SciTools:master Nov 7, 2018
@rcomer rcomer deleted the aggregate_by-fix-multidim-auxcoords branch December 7, 2018 16:41
@rcomer rcomer added this to the v2.3.0 milestone Oct 23, 2020
4 participants