Optimize performance of setCategoryIndex #1544

Merged on Apr 10, 2017 (13 commits)

Conversation

hy9be
Contributor

@hy9be hy9be commented Mar 31, 2017

I added an auxiliary map alongside the ``ax._categories`` array so that checking whether a category already exists no longer requires a linear ``Array.prototype.indexOf`` search.

I tested [here](https://jsfiddle.net/smileyhaowen/cvLzxz7L/2/) and observed ~20% performance improvement with a large number of traces.

Some range_slider unit tests fail, but I cannot figure out how my changes relate to those test cases.

hy9be added 2 commits March 31, 2017 16:56
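
For readers skimming the thread, the idea can be sketched roughly as below. This is a minimal illustration and not the code merged in this PR: `makeAxisSketch`, `setCategoryIndexSlow` and `setCategoryIndexFast` are hypothetical names, and using `Object.create(null)` for the map is an assumption made here to avoid inherited keys such as `toString`.

```javascript
// Hypothetical stand-in for an axis object; only the two fields discussed
// in this PR are modelled.
function makeAxisSketch() {
    return {
        _categories: [],                     // index -> category
        _categoriesMap: Object.create(null)  // category -> index
    };
}

// Before: scan the whole array on every call.
function setCategoryIndexSlow(ax, v) {
    var index = ax._categories.indexOf(v);
    if(index !== -1) return index;
    ax._categories.push(v);
    return ax._categories.length - 1;
}

// After: a property lookup in the auxiliary map replaces the scan; the
// array and the map are updated together so they stay in sync.
function setCategoryIndexFast(ax, v) {
    var index = ax._categoriesMap[v];
    if(index !== undefined) return index;
    index = ax._categories.length;
    ax._categories.push(v);
    ax._categoriesMap[v] = index;
    return index;
}

var ax = makeAxisSketch();
var indices = ['a', 'b', 'a', 'c', 'b'].map(function(v) {
    return setCategoryIndexFast(ax, v);
});
console.log(indices);        // [0, 1, 0, 2, 1]
console.log(ax._categories); // ['a', 'b', 'c']
```

Both functions return the same indices for string categories; only the lookup cost differs as the number of categories grows.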
@etpinard
Contributor

Thanks very much for the PR @hy9be 🍻

Looks like the image mocks testing axis categoryarray / categoryorder are failing

image


To run those tests locally, have a look at image test README.

@hy9be
Contributor Author

hy9be commented Mar 31, 2017

@etpinard Ok I will take a look.

@etpinard
Contributor

and also npm run test-jasmine -- calcdata is failing (which should be easier to debug than the image tests).

@hy9be
Contributor Author

hy9be commented Mar 31, 2017

@etpinard It relates to the categoryorder cases. I will need to take a further look.

@hy9be
Contributor Author

hy9be commented Apr 3, 2017

@etpinard Is it okay to use the ES6 Map object in plotly.js? A plain object used as a map in ES5 does not support non-string keys, which makes some cases fail.
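
A small illustration of the key problem, written for this thread rather than taken from the PR: plain-object keys are coerced to strings, so the number 1 and the string '1' share one slot, whereas `indexOf` on an array compares with strict equality and keeps them apart.

```javascript
// Plain objects coerce every property key to a string.
var map = {};
map[1] = 'set with the number 1';
map['1'] = 'set with the string "1"'; // overwrites the entry above

console.log(Object.keys(map)); // ['1']
console.log(map[1]);           // 'set with the string "1"'

// An array search uses strict equality, so the two values stay distinct.
var categories = [1, '1'];
console.log(categories.indexOf(1));   // 0
console.log(categories.indexOf('1')); // 1
```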

@etpinard
Contributor

etpinard commented Apr 3, 2017

Unfortunately, no. plotly.js will remain an ES5-only project for the foreseeable future.

@hy9be
Contributor Author

hy9be commented Apr 3, 2017

Got it. I will see if I can resolve the failures some other way. Thanks!

@hy9be
Contributor Author

hy9be commented Apr 3, 2017

@etpinard One last jasmine test case is still failing: https://github.com/plotly/plotly.js/blob/master/test/jasmine/tests/axes_test.js#L1853-L1877. What is this feature about?

My error is:

Expected [ 'B?', 'B->C', 'd', 'd', '' ] to equal [ 'A!', 'B?', 'B->C', 'c', 'd', 'd', '' ].
	    at Object.<anonymous> (/var/folders/4z/81cn93jd36q64qf2lbd48bx0lx05kn/T/tests/axes_test.js:1707:0 <- /var/folders/4z/81cn93jd36q64qf2lbd48bx0lx05kn/T/68c4de161f2f1ec90f156c4f21faa6b8.browserify:195317:29)
Expected 'a' to be 'A!'.
	    at Object.<anonymous> (/var/folders/4z/81cn93jd36q64qf2lbd48bx0lx05kn/T/tests/axes_test.js:1708:0 <- /var/folders/4z/81cn93jd36q64qf2lbd48bx0lx05kn/T/68c4de161f2f1ec90f156c4f21faa6b8.browserify:195318:42)

@alexcjohnson
Collaborator

> https://github.com/plotly/plotly.js/blob/master/test/jasmine/tests/axes_test.js#L1853-L1877. What is this feature about?

Haha that test is my doing, it's testing a collection of edge cases with the ticktext/tickvals feature, which lets you simultaneously override both the positions and labels of tick marks. The normal usage is stuff like https://plot.ly/javascript/axes/#enumerated-ticks-with-tickvals-and-ticktext

In the category case, tickvals entries can either be category strings, or numbers corresponding to the serial numbers assigned to the categories. The reason we need to allow numbers is they can also be fractional values if you want to put a tick in between categories or something. For example, maybe you want to label two neighboring categories with one tick halfway between them. There would be no way to do this using the category string.

And then after you provide a tick value, you may choose to override its text with your own string, but if you don't override it the automatic label is used (in this case the category string).

So in this case the problem is either actually a real regression the test is picking up, or the mockCalc function in this test is incompatible with the changes you've made and needs updating. The way to tell the difference would be to create the plot that this test is trying to mock, and see if it behaves correctly. In this case I'd do:

Plotly.newPlot(gd, [{x: ['a', 'b', 'c', 'd'], y: [1, 2, 3, 4]}],
    {xaxis: {
        range: [-0.5, 4.5],
        tickvals: ['a', 1, 1.5, 'c', 2.7, 3, 'e', 4, 5, -2],
        ticktext: ['A!', 'B?', 'B->C']
    }});

(the rest of ax is probably included just to get the right pieces in place since the test doesn't call supplyDefaults)
If I do that on the master branch I see:
[screenshot: the plot rendered on the master branch]
which you can see gives the expected tick labels [ 'A!', 'B?', 'B->C', 'c', 'd', 'd', '' ]. What do you see on your branch?
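
The tick-label rules described in this comment can be modelled roughly as follows. This is only a sketch that happens to reproduce the expected labels quoted above; `sketchTickLabels` is a hypothetical helper, not plotly.js code, and the rounding to the nearest category and the range filtering are assumptions inferred from the expected output.

```javascript
// Hypothetical helper, NOT plotly.js's implementation: an assumed model of
// how category tickvals/ticktext resolve to tick labels.
function sketchTickLabels(categories, range, tickvals, ticktext) {
    var labels = [];
    for(var i = 0; i < tickvals.length; i++) {
        var v = tickvals[i];
        // Category strings map to their serial number; numbers pass through,
        // which is what allows fractional positions between categories.
        var pos = (typeof v === 'string') ? categories.indexOf(v) : v;
        if(typeof v === 'string' && pos === -1) continue; // unknown category
        if(pos < range[0] || pos > range[1]) continue;    // outside the axis range

        var override = ticktext[i];
        var auto = categories[Math.round(pos)];           // assumed: nearest category
        if(override !== undefined) labels.push(override);
        else labels.push(auto !== undefined ? auto : '');
    }
    return labels;
}

var labels = sketchTickLabels(
    ['a', 'b', 'c', 'd'],
    [-0.5, 4.5],
    ['a', 1, 1.5, 'c', 2.7, 3, 'e', 4, 5, -2],
    ['A!', 'B?', 'B->C']
);
console.log(labels); // ['A!', 'B?', 'B->C', 'c', 'd', 'd', '']
```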

Construct a map rather than an array to improve the performance of index search. Jasmine tests passed.
@hy9be
Contributor Author

hy9be commented Apr 5, 2017

Ah got it. I had not looked at the rendering result. Will double-check. Thanks @alexcjohnson !

@hy9be
Contributor Author

hy9be commented Apr 8, 2017

CI reported error:
image

But in my local environment the test suites always pass:
image

And the only failed cases locally are the noCI gl3d cases:
image
image

@hy9be
Contributor Author

hy9be commented Apr 8, 2017

Now the tests passed.
By the way, is there a way to re-run the CI tests? Right now I have to push a dummy commit every time...

@alexcjohnson
Collaborator

Ah yes, I've seen that test fail intermittently. You should be able to retry the test by clicking "rebuild" on the test results page, but I'm glad you got this to work anyway. I'm away from my computer but will review this tomorrow night.

@rreusser
Contributor

rreusser commented Apr 8, 2017

(I think you might need to have permissions and be logged in in order to retry.)

@hy9be
Contributor Author

hy9be commented Apr 9, 2017

@rreusser you are right, the button is disabled:

image

@alexcjohnson
Collaborator

> you are right, the button is disabled:

Ah sorry, I thought I was looking at it signed-out on my phone... well, one more reason for us to clamp down on intermittent test failures!

} else {
    index = ax._categories.indexOf(v);
    if(index !== -1) return index;
}
Collaborator

If this block is only here to fix that test, and never gets hit in real code (which looks to be the case to me, since ax._categories and ax._categoriesMap always get populated simultaneously) then we should leave out the else block here, and just make the test more realistic by creating the corresponding _categoriesMap in the mock ax object.

For the record, this made me curious "do we still need ax._categories at all?" But you were right to not remove it, as getCategoryName (ax.c2d etc, ie hover info) still needs the reverse mapping to be fast.

Contributor Author

Totally agree. I will make the change on the mock object instead.

Yeah, I tried completely replacing the _categories array with the map, but there are some cases where we need a reverse mapping.

Actually, compared to maintaining a synced-up array and map pair, I feel there must be some better data structure for a performant categories collection. But I was sort of lazy and did not give it any more thought.

Collaborator

> I feel there must be some better data structure for a performant categories collection.

Good point, I'm sure there is, though it won't be built in, so it is unlikely to improve performance; it would just clean up the API... let's not worry about it for now. This two-part solution is light and easy enough to use here, but we can look into it further if this comes up in enough other places.
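
As a rough idea of what such a cleaned-up API might look like, here is a minimal sketch. `CategoryListSketch` and its methods are hypothetical and do not exist in plotly.js; the point is only that one object owns both directions of the mapping, so callers cannot update one side and forget the other.

```javascript
// Hypothetical helper, not part of plotly.js: one object owns both
// directions of the category <-> index mapping.
function CategoryListSketch() {
    this._names = [];                    // index -> name
    this._indices = Object.create(null); // name -> index
}

// Returns the existing index of `name`, or appends it and returns the new one.
CategoryListSketch.prototype.add = function(name) {
    var index = this._indices[name];
    if(index !== undefined) return index;
    index = this._names.length;
    this._names.push(name);
    this._indices[name] = index;
    return index;
};

// Forward lookup: name -> index, or -1 when the name is unknown.
CategoryListSketch.prototype.indexOf = function(name) {
    var index = this._indices[name];
    return index === undefined ? -1 : index;
};

// Reverse lookup: index -> name, or undefined when out of range.
CategoryListSketch.prototype.nameAt = function(index) {
    return this._names[index];
};

var list = new CategoryListSketch();
list.add('a');
list.add('b');
list.add('a');                  // already present, nothing is appended
console.log(list.indexOf('b')); // 1
console.log(list.nameAt(0));    // 'a'
console.log(list.indexOf('z')); // -1
```

As noted in the comment above, a wrapper like this is mostly about API hygiene; it performs the same array push and property lookup underneath.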

@@ -152,10 +152,8 @@ module.exports = function setConvert(ax, fullLayout) {
if(ax._categoriesMap) {
    index = ax._categoriesMap[v];
    if(index !== undefined) return index;
} else {
    index = ax._categories.indexOf(v);
    if(index !== -1) return index;
Collaborator

Awesome, I'm glad it was as easy as that :)

Apologies for the iterations, but one more related request since this is a perf PR of a couple of very hot code paths: I don't think that either getCategoryIndex or setCategoryIndex needs the fallbacks for missing _categoriesMap - the previous version, using only _categories, didn't have these fallbacks, and every path that creates _categories also creates _categoriesMap, right?

(also 📚 your comment above this is out of date now)

Contributor Author

@hy9be hy9be Apr 10, 2017

Right now no function falls back to using indexOf on _categories anymore.

In the current code I made sure _categoriesMap is always populated whenever _categories is created or initialized with some items. But this is not robust: someone could change _categories without updating _categoriesMap in their code, and it would not be obvious to them that they need to update _categoriesMap as well.

I have code with fallbacks in stash. If you think that's a better way I can quickly change it back.

Collaborator

Oh sorry, I just meant we can remove the fallbacks for if _categoriesMap doesn't even exist, ie in getCategoryIndex changing:

if(ax._categoriesMap) {
    var index = ax._categoriesMap[v];
    if(index !== undefined) return index;
}

to just:

var index = ax._categoriesMap[v];
if(index !== undefined) return index;

and removing from setCategoryIndex:

if(ax._categoriesMap === undefined) {
    ax._categoriesMap = {};
}

As far as I can tell, that much should be robust. Not a huge deal but should be a small performance boost.

But you raise a good point about future changes, I guess that's an argument for converting these two to a single bidirectional map object even if it doesn't yield any additional perf benefits.

Contributor Author

Oh I get it. Sorry I misunderstood your point.

Well, I feel a liiiittle bit uncomfortable doing that, because if there is any case where the map is not created, it will result in an uncaught exception, which to me seems more severe than a messed-up chart.
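
To make the trade-off concrete, here is a small sketch of the two variants. The function names and the -1 sentinel are hypothetical and chosen for this illustration only; the lookup bodies mirror the snippets quoted above.

```javascript
// Hypothetical lookups modelled on the snippets quoted in this thread.
function lookupGuarded(ax, v) {
    if(ax._categoriesMap) {
        var index = ax._categoriesMap[v];
        if(index !== undefined) return index;
    }
    return -1; // sentinel chosen for this sketch only
}

function lookupUnguarded(ax, v) {
    var index = ax._categoriesMap[v]; // throws if the map was never created
    if(index !== undefined) return index;
    return -1;
}

// An axis-like object where _categories exists but the map was forgotten.
var axWithoutMap = {_categories: ['a', 'b']};

console.log(lookupGuarded(axWithoutMap, 'a')); // -1: wrong answer, but no crash

var caught = null;
try {
    lookupUnguarded(axWithoutMap, 'a');
} catch(e) {
    caught = e;
}
console.log(caught instanceof TypeError); // true
```

The guarded version silently misses the category (a messed-up chart), while the unguarded one fails loudly with a TypeError.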

I also tried it on jsperf; the overhead of the null check is less than 10% on Chrome:
image
And given how cheap a map read is in itself, a 10% difference here will not be a big cost for the whole workflow.

In Safari the difference is even smaller.
image

Collaborator

That's fine, I'm pretty confident that it's OK since that's how it worked previously, but I can take responsibility for removing it (in another PR) and ensuring we have sufficient test coverage. You're right that the penalty is much smaller than the improvement you've made here (though I'd be wary of that jsperf result, I suspect in Safari the compiler has optimized away almost everything you see there).

So I think this is ready to go! Nice job @hy9be !

Contributor Author

Good to know! I used to trust the results I got from jsperf a lot...
Thanks a lot for reviewing my code! @alexcjohnson

@alexcjohnson alexcjohnson merged commit 630816f into plotly:master Apr 10, 2017
@etpinard etpinard added this to the v1.26.0 milestone Apr 11, 2017

4 participants