-
-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Optimize performance of setCategoryIndex #1544
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I added an auxiliary map for ``ax._categories`` array to avoid using ``Array.prototype.indexOf`` function searching for the existence of a category. I tested [here](https://jsfiddle.net/smileyhaowen/cvLzxz7L/2/) and observed ~20% performance improvements on big amount of traces.
@etpinard Ok I will take a look. |
and also |
@etpinard It relates to |
@etpinard Is it okay to use the map object in ES6 in Plotly.js? The ES5 map does not allow non-string keys, which failed some cases. |
Unfortunately, no. plotly.js will remain an ES5-only project for the foreseeable future. |
Got it. I will see if I could resolve the breaks in some other ways. Thanks! |
@etpinard Had 1 last jasmine test case failed: https://github.com/plotly/plotly.js/blob/master/test/jasmine/tests/axes_test.js#L1853-L1877. What is this feature about? My error is:
|
Haha that test is my doing, it's testing a collection of edge cases with the ticktext/tickvals feature, which lets you simultaneously override both the positions and labels of tick marks. The normal usage is stuff like https://plot.ly/javascript/axes/#enumerated-ticks-with-tickvals-and-ticktext In the category case, And then after you provide a tick value, you may choose to override its text with your own string, but if you don't override it the automatic label is used (in this case the category string). So in this case the problem is either actually a real regression the test is picking up, or the Plotly.newPlot(gd, [{x: ['a', 'b', 'c', 'd'], y: [1, 2, 3, 4]}],
{xaxis: {
range: [-0.5, 4.5],
tickvals: ['a', 1, 1.5, 'c', 2.7, 3, 'e', 4, 5, -2],
ticktext: ['A!', 'B?', 'B->C']
}}); (the rest of |
Construct a map rather than an array to improve the performance of index search. Jasmine tests passed.
Ah got it. I did not notice the rendering result. Will doublecheck. Thanks @alexcjohnson ! |
Now the tests passed. |
Ah yes, I've seen that test fail intermittently. You should be able to retry the test by clicking "rebuild" on the test results page, but I'm glad you got this to work anyway. I'm away from my computer but will review this tomorrow night. |
(I think you might need to have permissions and be logged in in order to retry.) |
@rreusser you are right, the button is disabled: |
Ah sorry, I thought I was looking at it signed-out on my phone... well, one more reason for us to clamp down on intermittent test failures! |
src/plots/cartesian/set_convert.js
Outdated
} else { | ||
index = ax._categories.indexOf(v); | ||
if(index !== -1) return index; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this block is only here to fix that test, and never gets hit in real code (which looks to be the case to me, since ax._categories
and ax._categoriesMap
always get populated simultaneously) then we should leave out the else
block here, and just make the test more realistic by creating the corresponding _categoriesMap
in the mock ax
object.
For the record, this made me curious "do we still need ax._categories
at all?" But you were right to not remove it, as getCategoryName
(ax.c2d
etc, ie hover info) still needs the reverse mapping to be fast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally agree. I will make the change on the mock object instead.
Yeah I tried to completely replacing the _categories
array with the map, but there will be some cases we need a reverse mapping.
Actually, comparing to maintaining a pair of synced-up array and map, I feel there must be some better data structure for a performant categories collection. But I was sort of lazy and did not give it another think anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel there must be some better data structure for a performant categories collection.
Good point, I'm sure there is, though it won't be built in so is unlikely to improve performance, it would just clean up the API... lets not worry about it for now, this two-part solution is light and easy enough to use here but we can look into it further if this comes up in enough other places.
… mock object in unit test instead
src/plots/cartesian/set_convert.js
Outdated
@@ -152,10 +152,8 @@ module.exports = function setConvert(ax, fullLayout) { | |||
if(ax._categoriesMap) { | |||
index = ax._categoriesMap[v]; | |||
if(index !== undefined) return index; | |||
} else { | |||
index = ax._categories.indexOf(v); | |||
if(index !== -1) return index; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, I'm glad it was as easy as that :)
Apologies for the iterations, but one more related request since this is a perf PR of a couple of very hot code paths: I don't think that either getCategoryIndex
or setCategoryIndex
needs the fallbacks for missing _categoriesMap
- the previous version, using only _categories
, didn't have these fallbacks, and every path that creates _categories
also creates _categoriesMap
, right?
(also 📚 your comment above this is out of date now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now no functions will fallback to use indexOf
on _categories
anymore.
Currently I made sure _categoriesMap
is always populated accordingly when _categories
is created or initialized with some items for the current code. But this is not a robust way. Someone could change _categories
without updating _categoriesMap
in their code, and it will not be trivial for them to realize they need to update _categoriesMap
.
I have code with fallbacks in stash. If you think that's a better way I can quickly change it back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sorry, I just meant we can remove the fallbacks for if _categoriesMap
doesn't even exist, ie in getCategoryIndex
changing:
if(ax._categoriesMap) {
var index = ax._categoriesMap[v];
if(index !== undefined) return index;
}
to just:
var index = ax._categoriesMap[v];
if(index !== undefined) return index;
and removing from setCategoryIndex
:
if(ax._categoriesMap === undefined) {
ax._categoriesMap = {};
}
As far as I can tell, that much should be robust. Not a huge deal but should be a small performance boost.
But you raise a good point about future changes, I guess that's an argument for converting these two to a single bidirectional map object even if it doesn't yield any additional perf benefits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I get it. Sorry I misunderstood your point.
Well, I feel a liiiittle bit uncomfortable of doing that, because if there is any case that the map is not created, it will result in an uncaught exception, which to me, seems more severe than a messed up chart.
And I tried on jsperf, the overhead of checking null is less than 10% on Chrome:
And considering the performance of reading a map itself, 10% differences here will not be a big cost for the whole workflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine, I'm pretty confident that it's OK since that's how it worked previously, but I can take responsibility for removing it (in another PR) and ensuring we have sufficient test coverage. You're right that the penalty is much smaller than the improvement you've made here (though I'd be wary of that jsperf result, I suspect in Safari the compiler has optimized away almost everything you see there).
So I think this is ready to go! Nice job @hy9be !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know! I used to trust the results I got from jsperf a lot...
Thanks a lot for reviewing my code! @alexcjohnson
I added an auxiliary map for
ax._categories
array to avoid usingArray.prototype.indexOf
function searching for the existence of a category.I tested here and observed ~20% performance improvements on big amount of traces.
The unit test failed at some range_slider cases but I cannot figure out how my changes are related to those test cases.