Is your feature request related to a problem?
I often need to plot a heatmap of a DataFrame which uses an IntervalIndex as its columns (and, usually, time as its index). Such a plot could also be called a "dynamic spectrum" or "2D histogram" and is used to quickly get an idea of how a spectrum develops over time.
This is slightly different from what is usually considered a heatmap (see #19008 for an example), as the bins are not necessarily equidistant and there is not necessarily a separate label for each bin. The y axis (which is used for the IntervalIndex) could even have logarithmic scaling.
Describe the solution you'd like
This could use the same API df.plot(type='heatmap') as suggested in #19008 and switch between appropriate axis scaling/labeling modes depending on whether a CategoricalIndex, IntervalIndex or other types of indices are used.
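To make the proposed switching concrete, here is a minimal sketch of how a plotting backend might pick an axis mode from the index type. The function name `heatmap_axis_mode` and the mode strings are hypothetical and not part of pandas; only the `isinstance` checks and `pd.api.types.is_numeric_dtype` are existing pandas API.

```python
import pandas as pd


def heatmap_axis_mode(index: pd.Index) -> str:
    """Pick an axis scaling/labeling mode for a heatmap axis (hypothetical helper)."""
    if isinstance(index, pd.CategoricalIndex):
        return "categorical"  # one equally sized, labeled cell per category
    if isinstance(index, pd.IntervalIndex):
        return "interval"     # cell edges taken from the interval bounds
    if isinstance(index, pd.DatetimeIndex):
        return "datetime"     # continuous time axis
    if pd.api.types.is_numeric_dtype(index.dtype):
        return "numeric"      # continuous numeric axis
    return "categorical"      # fall back to one label per entry
```

A real implementation would presumably also have to decide how to handle unsorted or duplicated index values, which this sketch ignores.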
Describe alternatives you've considered
My current implementation (see below) uses matplotlib's pcolormesh, but needs to do some fiddling with the bin edges to work correctly.
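The fiddling comes from pcolormesh wanting N+1 cell edges for N cells on each axis when flat shading is used. The IntervalIndex supplies its edges directly, but a time index only has N timestamps. The sketch below is one way to handle that, under two assumptions that are mine and not from the original code: each timestamp marks the start of its cell, and the last cell is as wide as the one before it. The function name `plot_interval_heatmap` is hypothetical, and the frame needs at least two rows and contiguous, sorted bins.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_interval_heatmap(df, ax=None, log_y=False):
    """Plot a frame with IntervalIndex columns as a heatmap (hypothetical helper)."""
    if ax is None:
        _, ax = plt.subplots()

    # y edges: every left bin edge plus the final right edge
    # (assumes the bins are sorted and contiguous).
    y_edges = np.append(df.columns.left, df.columns.right[-1])

    # x edges: assumption, each timestamp starts its cell and the last
    # cell has the same width as the second-to-last one.
    times = df.index
    last_edge = times[-1:] + (times[-1] - times[-2])
    x_edges = times.append(last_edge).to_numpy()

    # C has shape (n_bins, n_times); the edges have one more entry per axis.
    pcm = ax.pcolormesh(x_edges, y_edges, df.to_numpy().T, shading="flat")
    if log_y:
        ax.set_yscale("log")  # only meaningful if all bin edges are positive
    ax.figure.colorbar(pcm, ax=ax)
    return pcm
```

Because the edges are passed explicitly, non-equidistant bins are drawn at their true widths.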
Matplotlib's hist2d does not work for this use case, because the data is already stored in histogrammed form - the histogram and its bins don't need to be calculated, just plotted.
Seaborn's heatmap function seems to be limited to plotting categorical data, so both IntervalIndex and DatetimeIndex are displayed as categorical data with one label per bin, equidistant spacing, and values on the y axis sorted from top to bottom instead of bottom to top:
Additional context
My current implementation looks similar to this:
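import numpy as np

# y edges: every left bin edge plus the final right edge (ax is an existing matplotlib Axes)
bin_edges = np.append(df.columns.left, df.columns.right[-1])
X, Y = np.meshgrid(df.index, bin_edges)
# only the y axis has an extra edge, so the last time step is dropped to fit flat shading
pcm = ax.pcolormesh(X, Y, df.values.T[:, :-1])
# then add labels, colorbar etc.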
This only works if the IntervalIndex has no gaps and is non-overlapping, which would have to be checked first.
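A minimal sketch of such a check, assuming exact equality of neighbouring edges is the right criterion (true for indexes built with `pd.interval_range` or `IntervalIndex.from_breaks`, but possibly too strict for edges that come out of floating-point arithmetic). The function name `contiguous_bin_edges` is hypothetical.

```python
import numpy as np
import pandas as pd


def contiguous_bin_edges(intervals: pd.IntervalIndex) -> np.ndarray:
    """Return the N+1 bin edges of N sorted, gap-free, non-overlapping intervals."""
    if not isinstance(intervals, pd.IntervalIndex):
        raise TypeError("expected an IntervalIndex")
    if len(intervals) == 0:
        raise ValueError("IntervalIndex is empty")
    if not intervals.is_monotonic_increasing:
        raise ValueError("intervals must be sorted in increasing order")
    if intervals.is_overlapping:
        raise ValueError("intervals must not overlap")

    left = np.asarray(intervals.left)
    right = np.asarray(intervals.right)
    # No gaps: each interval must start exactly where the previous one ends.
    if not np.array_equal(left[1:], right[:-1]):
        raise ValueError("intervals must be contiguous (no gaps)")
    return np.append(left, right[-1])
```

With that in place, `contiguous_bin_edges(df.columns)` can stand in for the `np.append` line, and it raises instead of silently drawing misplaced cells when a bin is missing.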
I'm not sure this is a satisfactory solution, but I wanted to share it anyway since it does solve your problem in a neat way. However, it requires you to use a different library for visualization, so I understand if that's not workable for you.
I'm using Altair but I imagine the ggplot package can do something similar. Here's a pic. Note that it can handle missing data, without distorting the axes!
And here is the code:
import numpy as np
import pandas as pd
import altair as alt

# Generate some data
df = pd.DataFrame(np.random.rand(10, 10), index=pd.date_range("2020-04-19", periods=10, freq="D"), columns=pd.interval_range(start=0, end=1, periods=10))
df = df.drop(index=df.index[4], columns=df.columns[6])

# Reshape frame for visualization.
# We cast intervals to simple tuples of their endpoints,
# "melt" the dataframe and unpack the tuples so we
# end up with a frame of the form [timestamp, value, left_endpoint, right_endpoint]
tidy = df
tidy.columns = [(interval.left, interval.right) for interval in tidy.columns]
tidy = tidy.reset_index()  # Reset index necessary because pd.melt drops index, see #17440
tidy = tidy.melt(id_vars="index", var_name="energy")
tidy[["energy0", "energy1"]] = pd.DataFrame(tidy.energy.tolist())

# Visualize.
alt.Chart(tidy).mark_rect().encode(x="monthdate(index)", y="energy0", y2="energy1", color="value")
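The reshape step can also be written without going through tuple column labels, which pandas can interpret as a MultiIndex when a list of tuples is assigned to `columns`. The following is a sketch of an alternative, not the code used for the picture above: it builds the long-form frame directly from the interval bounds and leaves the original frame untouched. The function name `melt_intervals` is hypothetical; the output column names match the ones used in the chart encoding.

```python
import numpy as np
import pandas as pd


def melt_intervals(df: pd.DataFrame) -> pd.DataFrame:
    """Long-form frame with one row per (timestamp, interval) cell (hypothetical helper)."""
    n_rows, n_cols = df.shape
    return pd.DataFrame(
        {
            # Each timestamp is repeated once per interval column ...
            "index": np.repeat(df.index.to_numpy(), n_cols),
            # ... and the interval bounds cycle once per row.
            "energy0": np.tile(np.asarray(df.columns.left), n_rows),
            "energy1": np.tile(np.asarray(df.columns.right), n_rows),
            # Row-major flattening matches the repeat/tile order above.
            "value": df.to_numpy().ravel(),
        }
    )
```

The result can be passed to `alt.Chart(...)` with the same encoding as above. Missing rows or columns simply produce no rectangles, so gaps stay visible.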
@Rik-de-Kort This solution looks awesome! However, when I try to run your code, there is nothing shown on my screen. Do you know how to solve that? Thanks
Ah yes, add .serve() to the end of the chart, that will start a renderer. I didn't include it in case the user was in a notebook.