-
-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Non-fancy scattergl to work with dates #1021
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
37847ab
to
2ead564
Compare
2ead564
to
ab4ffbf
Compare
I added tests, including an implementation-dependent way of checking which |
@@ -118,7 +118,7 @@ proto.handlePick = function(pickResult) { | |||
trace: this, | |||
dataCoord: pickResult.dataCoord, | |||
traceCoord: [ | |||
this.pickXData[index], | |||
Number(this.pickXData[index]), // non-fancy scattergl has Dates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@monfera while pure-js users of plotly.js can use Date
objects, plot.ly (and plotly.js users that may want to serialize their plots) use date strings via Lib.dateTime2ms
(here is what regular cartesian does). I realize that this will be slower, but can we at least support it for compatibility with the rest of our dates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Alex, good point. Currently, scatter
works with ISO formatted dates e.g. 2016-10-13
but I'll need to add a conversion from string to date for scattergl
. It'll be done upfront. The line you refer to above probably won't need to change as by the time the tooltip gets data it's already in the standard JS Date
format, but the front-end part needs some logic. To avoid having to loop through a possibly large number of array elements, I'm thinking about just checking the type of the first element of the array; if it's Date
it's fine; if it's a string, then convert all elements to Date
. Is it OK in your opinion and also @etpinard ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes me a little wary, but I can't think of a reason users would mix Date
s and date strings (at least within one trace - definitely different traces should be checked independently). The first element may not be enough though, in case it's null
or something, but the first valid element seems OK. Especially since this trace type is specifically built for performance, I guess we can impose some sane restrictions on the users.
I haven't really looked at the architecture here, but if you're already converting the values you can just go straight to milliseconds, ignore Date
objects altogether. Also notice that it's not exactly ISO 8601 we use, I don't know its name but I think of it as SQL date format - ie maximally it's 2016-10-13 12:34:56.789123
but we're permissive about letting you truncate it after any part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alex I'd probably reuse the date conversion routine you pointed to above, as if at all possible, I'd shy away from rolling a new string
-> Date
parser. Yes I agree with using epoch milliseconds as a general principle esp. on greenfield projects, although in plotly
it looks like it's not typically done. There's already conversion by default, as WebGL buffers cover float arrays, so it does get converted to epoch ms already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@monfera Looking good!
Let's see by how much Lib.isDateTime
will slow down scattergl
in fast mode.
@@ -4,6 +4,7 @@ var Plots = require('@src/plots/plots'); | |||
var Lib = require('@src/lib'); | |||
var Color = require('@src/components/color'); | |||
var tinycolor = require('tinycolor2'); | |||
var hasWebGLSupport = require('../assets/has_webgl_support'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to keep axes_test.js
for test cases in relation to src/plots/cartesian/axes.js
only.
@monfera can you move your new test cases to one of the gl2d_
suite of your choice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New test are now separated from axes_test
.
expect(gd._fullData[0].type).toBe('scattergl'); | ||
expect(gd._fullData[0]._module.basePlotModule.name).toBe('gl2d'); | ||
|
||
// one way of check which renderer - fancy vs not - we're using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice find 👍
@@ -279,7 +279,8 @@ proto.updateFast = function(options) { | |||
yy = y[i]; | |||
|
|||
// check for isNaN is faster but doesn't skip over nulls | |||
if(!isNumeric(xx) || !isNumeric(yy)) continue; | |||
if(!isNumeric(yy)) continue; | |||
if(!isNumeric(xx) && !(xx instanceof Date)) continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lib.isDateTime
is what you need here.
I'm afraid it will be significantly slower than xx instanceof Date
though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@etpinard @alexcjohnson I'd just like to clarify something here. Testing with Lib.isDateTime
is useful here as long as we actually do something with the broader set of options, in particular, string representations of dates. So there's got to be logic to check the type and dispatch to either straight-through processing (no conversion, i.e. Date
objects that are handled now) or conversion from the string representations of dates to Date
objects. Should I go ahead and do these things? As mentioned above, I'd keep the fast path fast by testing the first legitimate value only, and if it's suitable for straight-through processing (STP) then the entire array would be assumed to be so. (As @alexcjohnson mentioned I'd check for the first non-null
value).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little surprised that epoch milliseconds are accepted as date data right now, I don't think our svg cartesian axes accept that, do they? But anyway I think your plan sounds good. Lib.dateTime2ms
does short-circuit Date
objects, but you could certainly shave off a little extra overhead by not going through it at all, and I think it's fine to assume (at least for these performance-oriented types) that all valid values in a given array have the same type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexcjohnson sorry I conflated two things in the spur of the moment, acceptance of numeric values and the rendering of dates. If I pass on epoch milliseconds it'll render fine but it won't render the numbers as dates; it'll simply show the epoch milliseconds as numbers. I'll revise my comment.
@@ -135,7 +135,7 @@ proto.handlePick = function(pickResult) { | |||
|
|||
// check if trace is fancy | |||
proto.isFancy = function(options) { | |||
if(this.scene.xaxis.type !== 'linear') return true; | |||
if(this.scene.xaxis.type !== 'linear' && this.scene.xaxis.type !== 'date') return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Note to self: make it work without the
|
I'm actually working right now on (for cartesian axes) having ranges (and a few other things that currently work like ranges) use date strings rather than milliseconds. You may want to hold off on that one until mine is done so we don't duplicate efforts. |
@alexcjohnson also the |
Yes. Pretty sure the way I'm doing it (by hooking into the same machinery we use to convert date data) will make that work automatically for cartesian, but in any event I'll make sure it works. |
positions[ptr++] = yy; | ||
if(!fastType) { | ||
xx = Lib.dateTime2ms(xx); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@etpinard @alexcjohnson as you see I ended up not skimping on tests, i.e. not just checking the first value, as we already had a check with relatively fast tests (number or Date
). It's only when this fails is when the slower path is taken. If you think I can optimize away the isDateTime
part, i.e. just assume that once we hit one of those, the rest of them will be like it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might worth exploring with placing the instanceOf Date
vs ms vs Lib.isDateTime
check as part of the autotype routine here.
Maybe, we could add a Lib.isDateTime
alternative that would track what kind of date time the x
are input at the defaults step and store that datetime type in e.g. fullLayout.xaxis._datetimeType
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... I think the autotype routine does it right. It samples not-more than 1000 pts (that takes ~ 3 ms
) to determine the axis type.
In the cartesian code, with that axis type, the data-to-calcData routines (i.e. ax.d2c
) simply assume a lone axis type for all items in x
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@etpinard thanks for testing and making a decision over whether this approach is fast enough or not. The current code in master already incurred the expense of two isNumeric
calls per loop execution and admittedly I haven't attempted to make it faster than that. I agree that we can take this Date
rework as an opportunity to speed up the loop a bit. For now, it can only be done probabilistically (for future options, I added a few lines of comment), as you said. In my previous commit I didn't shoot for such speedup, in part because it would be based on a heuristic, and in part because, if we want to increase speed, we might be able to do other things:
- if we want to speed up
plotly.js
time, then permitting a typed array input (= no array copy), or at least letting the user specify the value types can speed things up - if we want to speed it up when called via Python, then either the Python interface could directly convert the date strings into JavaScript
Date
objects (or better, directly populate a typed array), or, the Python -> JS WebSockets interface can directly deserialize into a JS typed array (this of course assumes that Python sends epoch milliseconds).
Back to present time: the fast path can be provided if all values can be assumed to be number (or Date
) rather than a majority, so I ended up not tampering with the autotype
logic (it could be more DRY but wanted to keep loop bodies simple, and direct, for speed).
@monfera nice job 🎉 Using: var x = fill(N, (_, i) => (new Date(2016, 8, i % 31)));
var y = fill(N, () => Math.random());
console.time('gl2d dates')
Plotly.plot(Tabs.fresh(), [{
type: 'scattergl',
mode: 'markers',
x: x,
y: y,
opacity: 0.6
}], {
title: `scattergl date coordinates with ${N} pts`
})
.then(() => console.timeEnd('gl2d dates'));
function fill(N, func) {
var out = new Array(N);
for(var i = 0; i < N; i++) {
out[i] = func(null, i);
}
return out;
} and |
Looks like your last commit (3c79068) adds about It's not obvious to me why. Investigating. |
Ignore my last comment, looks like that ~ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is looking great to me. 💃
Looks like we're now able to plot 1e6
dates in 1.5 to 2 seconds . Nice job 🐎
// (for the future, typed arrays can guarantee it, and Date values can be done with | ||
// representing the epoch milliseconds in a typed array; | ||
// also, perhaps the Python / R interfaces take care of String->Date conversions | ||
// such that there's no need to check for string dates in plotly.js) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comment.
such that there's no need to check for string dates in plotly.js
That won't happen any time soon unfortunately ...
} | ||
} | ||
|
||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very nicely done.
Performance improvement - it avoids the slower fancy
scattergl
that used to be triggered by the presence of theDate
X axis type.Fixes issue #413