Non-fancy scattergl to work with dates #1021

monfera · 2016-10-07T14:25:22Z

Performance improvement - it avoids the slower fancy scattergl that used to be triggered by the presence of the Date X axis type.
Fixes issue #413

monfera · 2016-10-11T07:42:45Z

I added tests, including an implementation-dependent way of checking which gl renderer (fancy vs fast) is being used; interested in a review.

alexcjohnson · 2016-10-13T01:28:49Z

src/traces/scattergl/convert.js

@@ -118,7 +118,7 @@ proto.handlePick = function(pickResult) {
        trace: this,
        dataCoord: pickResult.dataCoord,
        traceCoord: [
-            this.pickXData[index],
+            Number(this.pickXData[index]), // non-fancy scattergl has Dates


@monfera while pure-js users of plotly.js can use Date objects, plot.ly (and plotly.js users that may want to serialize their plots) use date strings via Lib.dateTime2ms (here is what regular cartesian does). I realize that this will be slower, but can we at least support it for compatibility with the rest of our dates?

Hey Alex, good point. Currently, scatter works with ISO formatted dates e.g. 2016-10-13 but I'll need to add a conversion from string to date for scattergl. It'll be done upfront. The line you refer to above probably won't need to change, as by the time the tooltip gets data it's already in the standard JS Date format, but the front-end part needs some logic. To avoid having to loop through a possibly large number of array elements, I'm thinking about just checking the type of the first element of the array: if it's a Date, it's fine; if it's a string, then convert all elements to Date. Does that sound OK to you, and also to @etpinard?

It makes me a little wary, but I can't think of a reason users would mix Dates and date strings (at least within one trace - definitely different traces should be checked independently). The first element may not be enough though, in case it's null or something, but the first valid element seems OK. Especially since this trace type is specifically built for performance, I guess we can impose some sane restrictions on the users.
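For illustration, the "first valid element" check discussed above could be sketched roughly as follows. This is a minimal sketch, not the code in this PR: `firstValidKind` is a hypothetical helper name, and treating `null`, `undefined` and `NaN` as "not valid" is an assumption.

```javascript
// Hypothetical helper (not part of plotly.js): classify an array by the
// type of its first valid element, skipping null, undefined and NaN.
function firstValidKind(arr) {
    for (var i = 0; i < arr.length; i++) {
        var v = arr[i];
        if (v === null || v === undefined) continue;
        if (typeof v === 'number') {
            if (isNaN(v)) continue; // NaN is treated as a gap, keep looking
            return 'number';
        }
        if (v instanceof Date) return 'date';
        if (typeof v === 'string') return 'string';
        return 'other';
    }
    return 'empty'; // no valid element found
}
```

Each trace would call this on its own `x` array, so that different traces are checked independently.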

I haven't really looked at the architecture here, but if you're already converting the values you can just go straight to milliseconds, ignore Date objects altogether. Also notice that it's not exactly ISO 8601 we use, I don't know its name but I think of it as SQL date format - ie maximally it's 2016-10-13 12:34:56.789123 but we're permissive about letting you truncate it after any part.
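As a rough illustration of the truncatable format described above, a shape check could look like the sketch below. The pattern and the name `looksLikeDateString` are hypothetical; this is not how `Lib.isDateTime` is implemented, and it checks only the shape of the string, not whether the month, day or time fields are in range.

```javascript
// Hypothetical pattern: "YYYY-MM-DD HH:MM:SS.ffffff", where the string may be
// cut off after any complete part (year, month, day, hour, minute, second).
var TRUNCATABLE_DATE = /^\d{4}(-\d{2}(-\d{2}( \d{2}(:\d{2}(:\d{2}(\.\d+)?)?)?)?)?)?$/;

// Returns true only for strings shaped like the format above.
function looksLikeDateString(s) {
    return typeof s === 'string' && TRUNCATABLE_DATE.test(s);
}
```

Note that the ISO 8601 "T" separator is rejected by this sketch, since the format described above uses a space between the date and the time.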

Alex I'd probably reuse the date conversion routine you pointed to above, as if at all possible, I'd shy away from rolling a new string -> Date parser. Yes I agree with using epoch milliseconds as a general principle esp. on greenfield projects, although in plotly it looks like it's not typically done. There's already conversion by default, as WebGL buffers cover float arrays, so it does get converted to epoch ms already.

etpinard

@monfera Looking good!

Let's see by how much Lib.isDateTime will slow down scattergl in fast mode.

etpinard · 2016-10-13T16:38:46Z

test/jasmine/tests/axes_test.js

@@ -4,6 +4,7 @@ var Plots = require('@src/plots/plots');
 var Lib = require('@src/lib');
 var Color = require('@src/components/color');
 var tinycolor = require('tinycolor2');
+var hasWebGLSupport = require('../assets/has_webgl_support');


I'd like to keep axes_test.js for test cases in relation to src/plots/cartesian/axes.js only.

@monfera can you move your new test cases to one of the gl2d_ suite of your choice?

The new tests are now separated from axes_test.js.

etpinard · 2016-10-13T16:40:15Z

test/jasmine/tests/axes_test.js

+            expect(gd._fullData[0].type).toBe('scattergl');
+            expect(gd._fullData[0]._module.basePlotModule.name).toBe('gl2d');
+
+            // one way of check which renderer - fancy vs not - we're using


nice find 👍

etpinard · 2016-10-13T16:44:21Z

src/traces/scattergl/convert.js

@@ -279,7 +279,8 @@ proto.updateFast = function(options) {
        yy = y[i];

        // check for isNaN is faster but doesn't skip over nulls
-        if(!isNumeric(xx) || !isNumeric(yy)) continue;
+        if(!isNumeric(yy)) continue;
+        if(!isNumeric(xx) && !(xx instanceof Date)) continue;


Lib.isDateTime is what you need here.

I'm afraid it will be significantly slower than xx instanceof Date though.

@etpinard @alexcjohnson I'd just like to clarify something here. Testing with Lib.isDateTime is useful here as long as we actually do something with the broader set of options, in particular, string representations of dates. So there's got to be logic to check the type and dispatch to either straight-through processing (no conversion, i.e. Date objects that are handled now) or conversion from the string representations of dates to Date objects. Should I go ahead and do these things? As mentioned above, I'd keep the fast path fast by testing the first legitimate value only, and if it's suitable for straight-through processing (STP) then the entire array would be assumed to be so. (As @alexcjohnson mentioned I'd check for the first non-null value).

I'm a little surprised that epoch milliseconds are accepted as date data right now, I don't think our svg cartesian axes accept that, do they? But anyway I think your plan sounds good. Lib.dateTime2ms does short-circuit Date objects, but you could certainly shave off a little extra overhead by not going through it at all, and I think it's fine to assume (at least for these performance-oriented types) that all valid values in a given array have the same type.

@alexcjohnson sorry I conflated two things in the spur of the moment, acceptance of numeric values and the rendering of dates. If I pass on epoch milliseconds it'll render fine but it won't render the numbers as dates; it'll simply show the epoch milliseconds as numbers. I'll revise my comment.

etpinard · 2016-10-13T16:44:30Z

src/traces/scattergl/convert.js

@@ -135,7 +135,7 @@ proto.handlePick = function(pickResult) {

 // check if trace is fancy
 proto.isFancy = function(options) {
-    if(this.scene.xaxis.type !== 'linear') return true;
+    if(this.scene.xaxis.type !== 'linear' && this.scene.xaxis.type !== 'date') return true;


monfera · 2016-10-13T18:46:44Z

Note to self: make it work without the valueOf() calls

            'xaxis': {
                'range': [
                    new Date(2016, 0, 1, 2, 0, 0, 0).valueOf(),
                    new Date(2016, 0, 1, 2, 0, 0, 11).valueOf()
                ],
                'autorange': false
            }

alexcjohnson · 2016-10-13T19:33:00Z

Note to self: make it work without the valueOf() calls

I'm actually working right now on (for cartesian axes) having ranges (and a few other things that currently work like ranges) use date strings rather than milliseconds. You may want to hold off on that one until mine is done so we don't duplicate efforts.

monfera · 2016-10-13T19:40:53Z

@alexcjohnson also the Date type? I don't know if it's desired, just a thought.

alexcjohnson · 2016-10-13T19:57:25Z

also the Date type?

Yes. Pretty sure the way I'm doing it (by hooking into the same machinery we use to convert date data) will make that work automatically for cartesian, but in any event I'll make sure it works.

monfera · 2016-10-19T19:29:50Z

src/traces/scattergl/convert.js

-        positions[ptr++] = yy;
+            if(!fastType) {
+                xx = Lib.dateTime2ms(xx);
+            }


@etpinard @alexcjohnson as you can see, I ended up not skimping on tests, i.e. not just checking the first value, as we already had a check with relatively fast tests (number or Date). The slower path is taken only when this check fails. Let me know if you think I can optimize away the isDateTime part, i.e. just assume that once we hit one such value, the rest of them will be like it.

It might be worth exploring placing the instanceof Date vs ms vs Lib.isDateTime check as part of the autotype routine here.

Maybe we could add a Lib.isDateTime alternative that would track, at the defaults step, which kind of date-time the x values are input as, and store that date-time type in e.g. fullLayout.xaxis._datetimeType?

... I think the autotype routine does it right. It samples no more than 1000 points (which takes ~ 3 ms) to determine the axis type.

In the cartesian code, with that axis type, the data-to-calcData routines (i.e. ax.d2c) simply assume a lone axis type for all items in x.
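To make the sampling idea concrete, here is a minimal sketch of guessing the value kind from a bounded number of evenly spaced samples. It is not the actual autotype routine: `guessKindBySampling` and its return values are hypothetical, and unlike the real routine it does not recognize date strings.

```javascript
// Hypothetical sketch: look at no more than maxSamples evenly spaced items
// and report which kind of value dominates among them.
function guessKindBySampling(arr, maxSamples) {
    // Stride chosen so that at most maxSamples items are visited.
    var step = Math.max(1, Math.ceil(arr.length / maxSamples));
    var counts = {number: 0, date: 0, other: 0};

    for (var i = 0; i < arr.length; i += step) {
        var v = arr[i];
        if (v instanceof Date) counts.date++;
        else if (typeof v === 'number' && !isNaN(v)) counts.number++;
        else counts.other++;
    }

    // Pick the most frequent kind; null when nothing was sampled.
    var best = null;
    var bestCount = 0;
    Object.keys(counts).forEach(function(k) {
        if (counts[k] > bestCount) {
            best = k;
            bestCount = counts[k];
        }
    });
    return best;
}
```

Because only a sample is inspected, the result is a majority guess about the array, not a guarantee about every element.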

@etpinard thanks for testing and making a decision over whether this approach is fast enough or not. The current code in master already incurred the expense of two isNumeric calls per loop execution and admittedly I haven't attempted to make it faster than that. I agree that we can take this Date rework as an opportunity to speed up the loop a bit. For now, it can only be done probabilistically (for future options, I added a few lines of comment), as you said. In my previous commit I didn't shoot for such speedup, in part because it would be based on a heuristic, and in part because, if we want to increase speed, we might be able to do other things:

If we want to reduce the time spent in plotly.js itself, then permitting typed array input (= no array copy), or at least letting the user specify the value types, can speed things up.

If we want to speed it up when called via Python, then either the Python interface could directly convert the date strings into JavaScript Date objects (or better, directly populate a typed array), or the Python -> JS WebSockets interface could directly deserialize into a JS typed array (this of course assumes that Python sends epoch milliseconds).

Back to present time: the fast path can be provided if all values can be assumed to be number (or Date) rather than a majority, so I ended up not tampering with the autotype logic (it could be more DRY but wanted to keep loop bodies simple, and direct, for speed).
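A minimal sketch of the fast path / slow path split described above, under stated assumptions: `convertX`, `allNumbersOrDates` and `toMs` are hypothetical names, and `toMs` is only a stand-in that uses the built-in `Date.parse`. The PR itself calls `Lib.dateTime2ms`, which accepts a different set of date strings.

```javascript
// Stand-in for a date conversion routine (assumption: not Lib.dateTime2ms).
function toMs(v) {
    if (typeof v === 'number') return v;
    if (v instanceof Date) return v.getTime();
    if (typeof v === 'string') return Date.parse(v); // NaN if unparseable
    return NaN;
}

// Cheap check done once, outside the conversion loop.
function allNumbersOrDates(x) {
    for (var i = 0; i < x.length; i++) {
        var v = x[i];
        if (typeof v !== 'number' && !(v instanceof Date)) return false;
    }
    return true;
}

// Convert x values to epoch milliseconds in a typed array.
function convertX(x) {
    var out = new Float64Array(x.length);
    var i;
    if (allNumbersOrDates(x)) {
        // Fast path: unary plus leaves numbers alone and turns a Date
        // into its epoch milliseconds.
        for (i = 0; i < x.length; i++) out[i] = +x[i];
    } else {
        // Slow path: per-element conversion; invalid items become NaN.
        for (i = 0; i < x.length; i++) out[i] = toMs(x[i]);
    }
    return out;
}
```

The upfront check has to cover every element, not a sample, because a single string or null in the array would otherwise be coerced incorrectly by the fast loop.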

etpinard · 2016-10-19T22:10:04Z

@monfera nice job 🎉

Using:

var x = fill(N, (_, i) => (new Date(2016, 8, i % 31)));
var y = fill(N, () => Math.random());

console.time('gl2d dates')
Plotly.plot(Tabs.fresh(), [{
  type: 'scattergl',
  mode: 'markers',
  x: x,
  y: y,
  opacity: 0.6
}], {
 title: `scattergl date coordinates with ${N} pts`
})
.then(() => console.timeEnd('gl2d dates'));

function fill(N, func) {
  var out = new Array(N);
  for(var i = 0; i < N; i++) {
    out[i] = func(null, i);
  }
  return out;
}

and N = 1e6, Plotly.plot clocks in at ~ 2300 ms. At N = 2e5, we get ~ 800ms. That's roughly a factor of 2 greater than with numeric x/y coordinates.

etpinard · 2016-10-19T22:16:52Z

Looks like your last commit (3c79068) adds about 500ms at N = 1e6 - which is pretty significant.

It's not obvious to me why. Investigating.

etpinard · 2016-10-19T22:23:55Z

Ignore my last comment, looks like that ~ 500ms at N = 1e6 is part of the run-to-run spread.

monfera · 2016-10-20T09:36:22Z

@etpinard the last two commits made the perf discussion "outdated" in the eyes of GitHub; these commits are my answer to your suggestions.

etpinard

This PR is looking great to me. 💃

Looks like we're now able to plot 1e6 dates in 1.5 to 2 seconds. Nice job 🐎

etpinard · 2016-10-20T14:19:54Z

src/traces/scattergl/convert.js

+//     (for the future, typed arrays can guarantee it, and Date values can be done with
+//      representing the epoch milliseconds in a typed array;
+//      also, perhaps the Python / R interfaces take care of String->Date conversions
+//      such that there's no need to check for string dates in plotly.js)


Thanks for the comment.

such that there's no need to check for string dates in plotly.js

That won't happen any time soon unfortunately ...

etpinard · 2016-10-20T14:21:25Z

src/traces/scattergl/convert.js

+        }
+    }
+
+    return true;


very nicely done.

monfera added 2 commits October 7, 2016 14:34

Admit date X axes for fast (non-fancy) scattergl

fac70b2

Admit date X axes for fast (non-fancy) scattergl hover tooltip too

a23756d

etpinard added bug something broken status: in progress labels Oct 7, 2016

monfera mentioned this pull request Oct 7, 2016

Subpar scattergl performance with date axis #413

Closed

monfera force-pushed the date-scattergl branch from 37847ab to 2ead564 Compare October 10, 2016 19:01

adding tests for fancy vs non-fancy scattergl

ab4ffbf

monfera force-pushed the date-scattergl branch from 2ead564 to ab4ffbf Compare October 10, 2016 20:22

alexcjohnson reviewed Oct 13, 2016

etpinard added this to the v1.19.0 milestone Oct 13, 2016

etpinard suggested changes Oct 13, 2016

monfera added 2 commits October 17, 2016 15:53

separate the new date rendering tests from axes_test.js

f5d6113

Admit isDateTime === true values (e.g. date strings) in non-fancy sca…

3c79068

…ttergl

monfera commented Oct 19, 2016

monfera added 2 commits October 20, 2016 10:57

doing cheaper, probabilistic checks outside the scattergl conversion …

5b71b47

…loop

minor speedup - avoid some boolean checks in the scattergl conversion…

46a6b6c

… loop

etpinard reviewed Oct 20, 2016

etpinard added status: reviewable and removed status: in progress labels Oct 20, 2016

etpinard approved these changes Oct 20, 2016

etpinard mentioned this pull request Oct 20, 2016

Filter fixes #1062

Merged

etpinard merged commit e1d440d into plotly:master Oct 24, 2016
