Allow find_bad_channels_maxwell() to return scores #7845

hoechenberger · 2020-05-28T12:02:55Z

What does this implement/fix?

Inspired by mne-tools/mne-bids-pipeline#118 (comment), I have added a new parameter return_scores to preprocessing.find_bad_channels_maxwell().

If True, we will now return a dictionary with alle the scores, limits, and corresponding channel names and types.

~~- the edges of the time windows analyzed - the corresponding evaluation of a channel being flat in this window - the corresponding evaluation of a channel being noisy in this window~~

I currently refer to the latter two values as "scores", but actually they are just boolean values. So the returned ndarrays scores_flat and scores_noisy only contain True and False for each channels in all evaluated time segments. Wonder if there's a better name we could use?

Additional information

I first wanted to ask for feedback before starting to add tests.

agramfort · 2020-05-28T13:15:48Z

what do you want to do with that? how do you plan visualise this?

hoechenberger · 2020-05-28T13:27:28Z

what do you want to do with that? how do you plan visualise this?

My idea was that we could plot it with time (or segment number) on the abscissa and channels on the ordinate. Bad segments for individual channels would be colored in based on the type of bad (flat, noisy, or manually set).

My idea is to have sth that looks a little like the top-right subplot here, maybe even all of those subplots sans the spectrum:

agramfort · 2020-05-28T13:58:44Z

I think it's important to show what it can bring visually before updating public API

hoechenberger · 2020-05-28T18:06:57Z

@agramfort

I think it's important to show what it can bring visually before updating
public API

Very first very drafty draft:

I also want to add a topoplot with sensors shown in the respective colors.

Edit: The colored lines should always extend to the vertical dashed lines. There seems to be a small bug in my plotting script. But I think you can get the idea nevertheless!

drammock · 2020-05-28T18:41:06Z

@hoechenberger it seems this kind of plot could be profitably used for #7720 too. Maybe we should implement that, and use it to plot annotations from find_bads_maxwell?

hoechenberger · 2020-05-28T19:36:40Z

@hoechenberger it seems this kind of plot could be profitably used for #7720 too. Maybe we should implement that, and use it to plot annotations from find_bads_maxwell?

I had been thinking about this too. The issue with Annotations is that they extend across all channels, while here we're only interested in the temporal extent of a signal across individual channels… Any ideas how to reconcile this dichotomy?

hoechenberger · 2020-05-28T19:49:36Z

@drammock We can also talk off GH if you think it could make sense to brainstorm more quickly, just drop me a mail at [email protected]

hoechenberger · 2020-05-28T20:23:42Z

I currently refer to the latter two values as "scores", but actually they are just boolean values. So the returned ndarrays scores_flat and scores_noisy only contain True and False for each channels in all evaluated time segments. Wonder if there's a better name we could use?

Just to elaborate on this a little more. What I currently return are just the auto-detected bad segments of those channels which the algorithm found to be bad based on these segments. I simply tag these segments as "bad because flat", "bad because noisy", so it's kind of like a binary thing (bad or not), not a score as such.

We could also return the actual scores, which would make for a nice visualization: for each channel and time window, we would get a specific number (or score) that shows how "bad" that segment was. The plot would be very colorful and not as sparse as the one I posted above.

Now there would be a few issues with this approach:

Thresholding of the scores – i.e. deciding which score qualifies as "bad" – is done based on the present data. This means that absolute score values are meaningless and wouldn't help users to decide just how bad (or good!) certain segments of their data are.
We would have to create two subplots to represent the "flat" and "noisy" scores.
Also I believe that currently, the "noisy" channel detection skips all channels that had previously been identified as being "flat", so we wouldn't get any "noisy" scores for those.
In a graphical representation of all scores, it would be more difficult to find a way to make those channels that were found to be bad clearly stick out. This is the beauty of the sparse approach I presented in the figure above.

Potential solutions

Instead of absolute scores, we return the ratio score / threshold; so values <=1 would mean "does not qualify as bad", and values >1 would mean "bad segment".
Today's displays are wide, so… maybe it's not a problem to place those two plots on a horizontal grid.
Should be easy to fix.
Keep this sparse "binary" figure in addition to the score figure.

… but that's also a lot of figures :)

Thoughts?

agramfort · 2020-05-28T20:29:10Z

thoughts: play with it on real data see what is the most informative and ask feedback from users like @SophieHerbst

…

drammock · 2020-05-28T23:37:59Z

The issue with Annotations is that they extend across all channels, while here we're only interested in the temporal extent of a signal across individual channels… Any ideas how to reconcile this dichotomy?

fair point; either you would need one Annotations object per channel detected as bad, and a plotting function that took in multiple annotations objects (seems unweildy), or a single Annotations object that labels the annotations something like BAD_MAXWELL_MEG_1033 (or similar) and a plotting function that plots each annotation type on a different row / y-value of your figure. That latter option seems maybe workable for this case, and maybe even sensible behavior for what you would want for any old annotation file (i.e., BAD_EOG on one line, BAD_MOVEMENT on a different line, etc). WDYT?

agramfort · 2020-05-29T07:18:29Z

adding support for subsets of channels in annotations was requested in the past by some EEG users I know. I said for now YAGNI but we may reconsider

…

hoechenberger · 2020-05-29T08:19:19Z

@drammock

or a single Annotations object that labels the annotations something like BAD_MAXWELL_MEG_1033 (or similar) and a plotting function that plots each annotation type on a different row / y-value of your figure.

This could be a workaround, yes. But feels very hacky…

That latter option seems maybe workable for this case, and maybe even sensible behavior for what you would want for any old annotation file (i.e., BAD_EOG on one line, BAD_MOVEMENT on a different line, etc). WDYT?

I think I do understand the general concept you're proposing, but what do you mean by "old annotation file"?

@agramfort

adding support for subsets of channels in annotations was requested in the past by some EEG users I know. I said for now YAGNI but we may reconsider

I have to say that this obviously sounds like the cleanest solution to me. I'm mostly worried about how to expose this in the Raw browser plot, i.e. how to make it so that you can select a time range & add an Annotation to only a subset of channels.

agramfort · 2020-05-29T10:23:16Z

yes supporting this in UI will be tricky. I propose to sleep over it a bit more

…

drammock · 2020-05-29T14:11:52Z

what do you mean by "old annotation file"?

sorry, "any old annotation file" meant "any annotation file not created by find_bad_channels_maxwell" (i.e., one whose annotations were not channel-specific)

larsoner · 2020-06-05T11:47:51Z

Combining this with #1869 could potentially be quite useful

hoechenberger · 2020-06-05T11:56:16Z

Great idea, @larsoner!

I will pick up working again on this PR shortly.

Inspired by mne-tools/mne-bids-pipeline#118 (comment), I have added a new parameter `return_scores` to `preprocessing.find_bad_channels_maxwell()`. If True, we will now return - the edges of the time windows analyzed - the corresponding evaluation of a channel being flat in this window - the corresponding evaluation of a channel being noisy in this window I currently refer to the latter two values as "scores", but actually they are just boolean values. So the returned ndarrays `scores_flat` and `scores_noisy` only contain True and False for each channels in all evaluated time segments.

hoechenberger · 2020-06-08T17:47:25Z

@drammock After discussing with @agramfort and critically reconsidering my specific use case, I have decided that for now I will just work on the implementation of "return more elaborate scoring results" within the scope of this PR. I believe your suggestion regarding the plotting does make a lot of sense, but I would address this in a followup PR.

So what I've done now is I've changed the return value(s): When return_scores=True, we return a single dictionary scores, which currently has the keys bin_edges, scores_flat, scores_noisy, limits_flat, limits_noisy. The choice of a dict is wise because it will allow us to later alter its structure, without requiring a deprecation cycle of the public API (well… at least as long as we mark this functionality as experimental, haha)

With very little code, I can now produce helpful visualizations (I know, the colormap is not great):

    auto_noisy_chs, auto_flat_chs, scores = find_bad_channels_maxwell(
        raw=raw_lp_filtered_for_maxwell,
        calibration=config.mf_cal_fname,
        cross_talk=config.mf_ctc_fname,
        return_scores=True)

    bad_mask = np.ones_like(scores['scores_noisy'])
    bad_mask[scores['scores_noisy'] > scores['limits_noisy']] = 0

    fig, ax = plt.subplots(1, 2, figsize=(12, 5))
    sns.heatmap(scores['scores_noisy'], cmap='bwr', ax=ax[0])
    sns.heatmap(scores['scores_noisy'], mask=bad_mask, cmap='bwr', ax=ax[1])

    ax[0].set_xlabel('Data Segment', fontweight='bold')
    ax[0].set_ylabel('Channel Index', fontweight='bold')
    fig.suptitle('Automated Noisy Channel Detection', fontweight='bold')
    plt.show()

(absolute scores on the left, thresholded on the right)

WDYT?

drammock · 2020-06-08T18:22:14Z

for now I will just work on the implementation of "return more elaborate scoring results"

seems reasonable.

mne/preprocessing/maxwell.py

agramfort · 2020-06-09T08:11:41Z

you will need to add tests and expose this in an example as an illlustration.

larsoner · 2020-06-10T15:26:19Z

Can I not generate two "independent" figures somehow, in their own
or something?

Put them in separate ######## blocks and they will end up in different sections / their own single-image divs

hoechenberger · 2020-06-10T16:14:18Z

@larsoner I now only plot gradiometer results. This avoids the whole issue of figure placement. Also, having two large-scale figures (for mag and grad) would probably be too much anyway.

hoechenberger · 2020-06-10T18:13:29Z

Tutorial: https://20526-1301584-gh.circle-artifacts.com/0/dev/auto_tutorials/preprocessing/plot_60_maxwell_filtering_sss.html#using-sss-and-maxwell-filtering-in-mne-python

Example plot:

larsoner · 2020-06-10T19:04:59Z

mne/preprocessing/maxwell.py

+            "flat".
+        - ``scores_noisy`` : ndarray, shape (n_meg, n_windows)
+            The scores for testing whether MEG channels are noisy.
+        - ``limits_noisy`` : ndarray, shape (n_meg, 1)


why ndarray, shape (n_meg, 1)? It should be float, no? I see below you only set it for the good channels, but personally I would just leave it as a float (it's okay to set it for all channels because the bad ones get values of -np.inf so comparisons pass).

However, I just realized that a different, probably cleaner way around all of this: return ch_names : ndarray, shape (n_good_meg,) instead of shape (n_meg,) (and adjust all other vars, too). Then you only return what actually gets processed by the function.

Sorry for the run-around about this, hopefully just using n_good_meg makes everything cleaner...

However, I just realized that a different, probably cleaner way around all of this: return ch_names : ndarray, shape (n_good_meg,) instead of shape (n_meg,) (and adjust all other vars, too). Then you only return what actually gets processed by the function.

That's what I started out with, and then made the decision to return all meg channels, including the bad ones (even though they would not have any values set): Because I expect this to make it easier to create an interactive visualization based on the arrays returned here. I'm thinking about something like the viz I put into the tutorial, but also displaying channels from info['bads'], and where clicking on a tile of the heatmap would take you straight to the corresponding raw segment. It would be very useful to show all channels, including bads, in such an interactive visualization. And this would be easier to achieve if we got an array of the correct shape (i.e., with n_meg rows) right from the beginning…

Thoughts?

why ndarray, shape (n_meg, 1)? It should be float, no?

Also while this would work for the noisy channels, it wouldn't be applicable to the flat detection, because there we use different thresholds depending on channel type (mag, grad)

larsoner · 2020-06-11T18:36:41Z

I expect this to make it easier to create an interactive visualization based on the arrays returned here. I'm thinking about something like the viz I put into the tutorial, but also displaying channels from info['bads'], and where clicking on a tile of the heatmap would take you straight to the corresponding raw segment.

If the algorithm doesn't operate on / modify / analyze some channels, it doesn't seem like they should show up in the output at all. For example if you pass ignore_ref to maxwell_filter, those channels shouldn't show up even though they are MEG channels. Downstream viz should be built to take into account the channels that were actually processed, regardless of actual data ordering, channel selection, etc., so I don't see simplifying the viz step as sufficient motivation for including info['bads'] MEG channels in the output.

hoechenberger · 2020-06-11T19:03:30Z

Thanks for the input, @larsoner! I will discuss this with @SophieHerbst to get a better idea of what users might need and expect.

…

-- Sent from my phone, please excuse brevity and erroneous auto-correct.

On 11. Jun 2020, at 20:36, Eric Larson ***@***.***> wrote: I expect this to make it easier to create an interactive visualization based on the arrays returned here. I'm thinking about something like the viz I put into the tutorial, but also displaying channels from info['bads'], and where clicking on a tile of the heatmap would take you straight to the corresponding raw segment. If the algorithm doesn't operate on / modify / analyze some channels, it doesn't seem like they should show up in the output at all. For example if you pass ignore_ref to maxwell_filter, those channels shouldn't show up even though they are MEG channels. Downstream viz should be built to take into account the channels that were actually processed, regardless of actual data ordering, channel selection, etc., so I don't see simplifying the viz step as sufficient motivation for including info['bads'] MEG channels in the output. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

This reverts commit 5f36bd1.

hoechenberger · 2020-06-16T10:18:05Z

@larsoner
After thinking and also discussing this with @agramfort and @SophieHerbst, I have decided to:

return a score array for n_meg channels, not just for n_meg_good channels, because it makes downstream handling easier. For those channels that were marked as bad before running find_bad_channels_maxwell(), np.nan scores and limits will be returned. It's just as simple to omit those NaNs e.g. while plotting (in fact, most plotting functions will ignore them by default), and it seems more correct to have these value be np.nan than + / - np.inf. So if you have no strong objections to this – this is how I'd ike to do it.

I've also added a warning to the return_scores param, stating this feature is experimental and may break without deprecation cycle. This should give is the flexibility to adjust the feature when needed – in fact, we're still busy exploring real-world use cases.

Let me know what you think!

Otherwise, this is all good to go from my side.

larsoner · 2020-06-16T15:29:13Z

I can live with it, thanks @hoechenberger !

agramfort

whenever you have time @hoechenberger

agramfort · 2020-06-16T20:04:50Z

mne/preprocessing/maxwell.py

+            "noisy".
+
+        .. note:: The scores and limits for channels marked as ``bad`` in the
+                  input data will will be set to ``np.nan``.


will will -> will

agramfort · 2020-06-16T20:05:03Z

mne/preprocessing/maxwell.py

+        .. warning:: This feature is experimental and may change in a future
+                     version of MNE-Python without prior notice. Please
+                     report any problems and enhancement proposals to the
+                     developers.


we need ..versionadded

agramfort · 2020-06-16T20:06:12Z

tutorials/preprocessing/plot_60_maxwell_filtering_sss.py

-# channels:
+# Now we can update the list of bad channels in the dataset.
+
+bads = [*raw.info['bads'], *auto_noisy_chs, *auto_flat_chs]


bads = raw.info['bads'] + auto_noisy_chs + auto_flat_chs

seems simpler especially for non python gurus

* upstream/master: MRG, FIX: Speed up I/O tests, mark some slow (mne-tools#7904) Proper attribution for Blender tutorial (mne-tools#7900) MAINT: Check usage [ci skip] (mne-tools#7902) Allow find_bad_channels_maxwell() to return scores (mne-tools#7845) Warn if NIRx directory structure has been modified from original format (mne-tools#7898)

* upstream/master: (24 commits) WIP: Fix Travis (mne-tools#7906) WIP: Prototype of notebook viz (screencast) (mne-tools#7758) MRG, FIX: Speed up I/O tests, mark some slow (mne-tools#7904) Proper attribution for Blender tutorial (mne-tools#7900) MAINT: Check usage [ci skip] (mne-tools#7902) Allow find_bad_channels_maxwell() to return scores (mne-tools#7845) Warn if NIRx directory structure has been modified from original format (mne-tools#7898) Pin pvyista to 0.24.3 (mne-tools#7899) MRG: Add support for reading and writing sufaces to .obj (mne-tools#7824) Fix _auto_topomap_coords docstring. (mne-tools#7895) MRG, FIX: Ensure Info H5-writeable (mne-tools#7887) Website contents (mne-tools#7889) MRG, ENH: Add mri_resolution="sparse" (mne-tools#7888) MRG, ENH: Allow disabling FXAA (mne-tools#7877) remove "and and" [ci skip] (mne-tools#7882) fix evoked nave → inverse guidance (mne-tools#7881) ENH: Better error messages (mne-tools#7879) FIX : EDF+ Annotation Timestamps missing sub-second accuracy (mne-tools#7875) FIX: Fix get_channel_types (mne-tools#7878) MRG, BUG: Fix combine evokeds (mne-tools#7869) ...

* upstream/master: (23 commits) MAINT: Add mne.surface to docstring tests (mne-tools#7930) MRG: Add smoothing controller to TimeViewer for the notebook backend (mne-tools#7928) MRG: TimeViewer matplotlib figure color (mne-tools#7925) fix typos (mne-tools#7924) MRG, ENH: Add method to project onto max power ori (mne-tools#7883) WIP: Warn if untested NIRX device (mne-tools#7905) MRG, BUG: Fix bug with volume morph and subject_to!="fsaverage" (mne-tools#7896) MRG, MAINT: Clean up use of bool, float, int (mne-tools#7917) ENH: Better error message for incompatible Evoked objects (mne-tools#7910) try to fix nullcontext (mne-tools#7908) WIP: Fix Travis (mne-tools#7906) WIP: Prototype of notebook viz (screencast) (mne-tools#7758) MRG, FIX: Speed up I/O tests, mark some slow (mne-tools#7904) Proper attribution for Blender tutorial (mne-tools#7900) MAINT: Check usage [ci skip] (mne-tools#7902) Allow find_bad_channels_maxwell() to return scores (mne-tools#7845) Warn if NIRx directory structure has been modified from original format (mne-tools#7898) Pin pvyista to 0.24.3 (mne-tools#7899) MRG: Add support for reading and writing sufaces to .obj (mne-tools#7824) Fix _auto_topomap_coords docstring. (mne-tools#7895) ...

* Simplify bad channel concatenation * Add versionadded tag * Fix typo

* upstream/master: (30 commits) MRG: Add remove_labels to _Brain (mne-tools#7964) Add get_picked_points (mne-tools#7963) ENH: Add OpenGL info to mne sys_info (mne-tools#7976) [MRG] Fix reject_tmin and reject_tmax for reject_by_annotation in mne.Epochs (mne-tools#7967) mrg: Add scalar mult and div operators for AverageTFR (mne-tools#7957) MRG, MAINT: Cleaner workaround for Sphinx linking issue (mne-tools#7970) MRG, ENH: Speed up epochs.copy (mne-tools#7968) MRG, BUG: Allow ref mags to have a comp grade (mne-tools#7965) do not forget to pass adjacency (mne-tools#7961) [MRG] fix Issue with stc.project after restricting to a label (mne-tools#7950) Only process nirx event file if present (mne-tools#7951) MRG+1: BUG: info['bads'] order shouldn't matter in write_evokeds() (mne-tools#7954) Fix some small glitches introduced via mne-tools#7845 (mne-tools#7952) Add time player (mne-tools#7940) MAINT: Clean up VTK9 offset array [circle front] (mne-tools#7953) MAINT: Skip a few more on macOS (mne-tools#7948) fix links [skip travis] (mne-tools#7949) MRG, MAINT: Tweak CIs (mne-tools#7943) MRG, BUG: Fix vector scaling (mne-tools#7934) MRG, VIZ, BUG: handle CSD channel type when topo plotting (mne-tools#7935) ...

hoechenberger requested review from agramfort and larsoner and removed request for agramfort May 28, 2020 12:02

hoechenberger added 2 commits June 8, 2020 13:55

Return a dict

d59abd2

hoechenberger force-pushed the bad-segments branch from 52b38e3 to d59abd2 Compare June 8, 2020 13:17

Docstrings

7dcdda0

agramfort reviewed Jun 9, 2020

View reviewed changes

mne/preprocessing/maxwell.py Show resolved Hide resolved

hoechenberger added 2 commits June 9, 2020 11:40

Fixes

9cbae20

Docstring consistency

bbab584

Only plot gradiometers

709c458

hoechenberger added 6 commits June 10, 2020 18:15

Forgot to adjust comment

674f149

Typo

0cbb729

Use inf, -inf instead of nan

5f36bd1

Cleanup

d061258

Remove leftover, elaborate on masking

704ca7e

Fix circle

eefd90e

larsoner reviewed Jun 10, 2020

View reviewed changes

hoechenberger added 6 commits June 16, 2020 09:28

Revert "Use inf, -inf instead of nan"

cb386ce

This reverts commit 5f36bd1.

Update tutorial

b601383

Fix docstring

8bebd06

Add warning

fea5a8d

Remove double linebreak

b1cefb8

Phrasing

37775cb

larsoner merged commit 3cda359 into mne-tools:master Jun 16, 2020

hoechenberger deleted the bad-segments branch June 16, 2020 16:16

agramfort reviewed Jun 16, 2020

View reviewed changes

larsoner pushed a commit that referenced this pull request Jul 1, 2020

Fix some small glitches introduced via #7845 (#7952)

a8a8269

* Simplify bad channel concatenation * Add versionadded tag * Fix typo

drammock mentioned this pull request Jul 15, 2020

WIP: Add NIRSport support #7936

Closed

Allow find_bad_channels_maxwell() to return scores #7845

Allow find_bad_channels_maxwell() to return scores #7845

Uh oh!

Conversation

hoechenberger commented May 28, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this implement/fix?

Additional information

Uh oh!

agramfort commented May 28, 2020

Uh oh!

hoechenberger commented May 28, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

agramfort commented May 28, 2020 via email

Uh oh!

hoechenberger commented May 28, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

drammock commented May 28, 2020

Uh oh!

hoechenberger commented May 28, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hoechenberger commented May 28, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hoechenberger commented May 28, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

agramfort commented May 28, 2020 via email

Uh oh!

drammock commented May 28, 2020

Uh oh!

agramfort commented May 29, 2020 via email

Uh oh!

hoechenberger commented May 29, 2020

Uh oh!

agramfort commented May 29, 2020 via email

Uh oh!

drammock commented May 29, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

larsoner commented Jun 5, 2020

Uh oh!

hoechenberger commented Jun 5, 2020

Uh oh!

hoechenberger commented Jun 8, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

drammock commented Jun 8, 2020

Uh oh!

Uh oh!

agramfort commented Jun 9, 2020

Uh oh!

larsoner commented Jun 10, 2020

Uh oh!

hoechenberger commented Jun 10, 2020

Uh oh!

hoechenberger commented Jun 10, 2020

Uh oh!

larsoner Jun 10, 2020

Choose a reason for hiding this comment

Uh oh!

hoechenberger Jun 10, 2020

Choose a reason for hiding this comment

Uh oh!

hoechenberger Jun 11, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

larsoner commented Jun 11, 2020

Uh oh!

hoechenberger commented Jun 11, 2020 via email

Uh oh!

hoechenberger commented Jun 16, 2020

Uh oh!

larsoner commented Jun 16, 2020

Uh oh!

agramfort left a comment

hoechenberger commented May 28, 2020 •

edited

Loading

hoechenberger commented May 28, 2020 •

edited

Loading

hoechenberger commented May 28, 2020 •

edited

Loading

hoechenberger commented May 28, 2020 •

edited

Loading

hoechenberger commented May 28, 2020 •

edited

Loading

hoechenberger commented May 28, 2020 •

edited

Loading

drammock commented May 29, 2020 •

edited

Loading

hoechenberger commented Jun 8, 2020 •

edited

Loading

hoechenberger Jun 11, 2020 •

edited

Loading