diff --git a/doc/source/_static/rplot-seaborn-example1.png b/doc/source/_static/rplot-seaborn-example1.png new file mode 100644 index 0000000000000..d19a3a018bfbf Binary files /dev/null and b/doc/source/_static/rplot-seaborn-example1.png differ diff --git a/doc/source/_static/rplot-seaborn-example2.png b/doc/source/_static/rplot-seaborn-example2.png new file mode 100644 index 0000000000000..9293082e78129 Binary files /dev/null and b/doc/source/_static/rplot-seaborn-example2.png differ diff --git a/doc/source/_static/rplot-seaborn-example3.png b/doc/source/_static/rplot-seaborn-example3.png new file mode 100644 index 0000000000000..8fd311acbd528 Binary files /dev/null and b/doc/source/_static/rplot-seaborn-example3.png differ diff --git a/doc/source/_static/rplot-seaborn-example3b.png b/doc/source/_static/rplot-seaborn-example3b.png new file mode 100644 index 0000000000000..4bfbac574ef29 Binary files /dev/null and b/doc/source/_static/rplot-seaborn-example3b.png differ diff --git a/doc/source/_static/rplot-seaborn-example4.png b/doc/source/_static/rplot-seaborn-example4.png new file mode 100644 index 0000000000000..8e08c7e86178a Binary files /dev/null and b/doc/source/_static/rplot-seaborn-example4.png differ diff --git a/doc/source/_static/rplot-seaborn-example6.png b/doc/source/_static/rplot-seaborn-example6.png new file mode 100644 index 0000000000000..0fa56f4a018e7 Binary files /dev/null and b/doc/source/_static/rplot-seaborn-example6.png differ diff --git a/doc/source/visualization.rst b/doc/source/visualization.rst index 6bef7f6f456c8..852397c355361 100644 --- a/doc/source/visualization.rst +++ b/doc/source/visualization.rst @@ -1607,6 +1607,16 @@ when plotting a large number of points. Trellis plotting interface -------------------------- +.. warning:: + + The ``rplot`` trellis plotting interface is **deprecated and will be removed + in a future version**. We refer to external packages like + `seaborn `_ for similar but more + refined functionality. 
+ + The docs below include some examples of how to convert your existing code to + ``seaborn``. + .. ipython:: python :suppress: @@ -1622,7 +1632,6 @@ Trellis plotting interface iris_data = read_csv('data/iris.data') from pandas import read_csv from pandas.tools.plotting import radviz - import pandas.tools.rplot as rplot plt.close('all') @@ -1641,13 +1650,20 @@ Trellis plotting interface We import the rplot API: .. ipython:: python + :okwarning: import pandas.tools.rplot as rplot Examples ~~~~~~~~ -RPlot is a flexible API for producing Trellis plots. These plots allow you to arrange data in a rectangular grid by values of certain attributes. +RPlot was an API for producing Trellis plots. These plots allow you to +arrange data in a rectangular grid by values of certain attributes. +In the example below, data from the tips data set is arranged by the attributes +'sex' and 'smoker'. Since both of those attributes can take on one of two +values, the resulting grid has two columns and two rows. A histogram is +displayed for each cell of the grid. + .. ipython:: python @@ -1665,7 +1681,20 @@ RPlot is a flexible API for producing Trellis plots. These plots allow you to ar plt.close('all') -In the example above, data from the tips data set is arranged by the attributes 'sex' and 'smoker'. Since both of those attributes can take on one of two values, the resulting grid has two columns and two rows. A histogram is displayed for each cell of the grid. +A similar plot can be made with ``seaborn`` using the ``FacetGrid`` object, +resulting in the following image: + +.. code-block:: python + + import seaborn as sns + g = sns.FacetGrid(tips_data, row="sex", col="smoker") + g.map(plt.hist, "total_bill") + +.. image:: _static/rplot-seaborn-example1.png + + +The example below is the same as the previous one, except the plot is set to kernel density +estimation. A ``seaborn`` example is included beneath. ..
ipython:: python @@ -1683,7 +1712,15 @@ In the example above, data from the tips data set is arranged by the attributes plt.close('all') -Example above is the same as previous except the plot is set to kernel density estimation. This shows how easy it is to have different plots for the same Trellis structure. +.. code-block:: python + + g = sns.FacetGrid(tips_data, row="sex", col="smoker") + g.map(sns.kdeplot, "total_bill") + +.. image:: _static/rplot-seaborn-example2.png + +The plot below shows that it is possible to have two or more plots for the same +data displayed on the same Trellis grid cell. .. ipython:: python @@ -1702,7 +1739,27 @@ Example above is the same as previous except the plot is set to kernel density e plt.close('all') -The plot above shows that it is possible to have two or more plots for the same data displayed on the same Trellis grid cell. +A seaborn equivalent for a simple scatter plot: + +.. code-block:: python + + g = sns.FacetGrid(tips_data, row="sex", col="smoker") + g.map(plt.scatter, "total_bill", "tip") + +.. image:: _static/rplot-seaborn-example3.png + +and with a regression line, using the dedicated ``seaborn`` ``regplot`` function: + +.. code-block:: python + + g = sns.FacetGrid(tips_data, row="sex", col="smoker", margin_titles=True) + g.map(sns.regplot, "total_bill", "tip", order=2) + +.. image:: _static/rplot-seaborn-example3b.png + + +Below is a similar plot but with 2D kernel density estimation plot superimposed, +followed by a ``seaborn`` equivalent: .. ipython:: python @@ -1721,7 +1778,17 @@ The plot above shows that it is possible to have two or more plots for the same plt.close('all') -Above is a similar plot but with 2D kernel density estimation plot superimposed. +.. code-block:: python + + g = sns.FacetGrid(tips_data, row="sex", col="smoker") + g.map(plt.scatter, "total_bill", "tip") + g.map(sns.kdeplot, "total_bill", "tip") + +.. 
image:: _static/rplot-seaborn-example4.png + +It is possible to only use one attribute for grouping data. The example below +only uses the 'sex' attribute. If the second grouping attribute is not specified, +the plots will be arranged in a column. .. ipython:: python @@ -1739,7 +1806,7 @@ Above is a similar plot but with 2D kernel density estimation plot superimposed. plt.close('all') -It is possible to only use one attribute for grouping data. The example above only uses 'sex' attribute. If the second grouping attribute is not specified, the plots will be arranged in a column. +If the first grouping attribute is not specified, the plots will be arranged in a row. .. ipython:: python @@ -1757,16 +1824,18 @@ It is possible to only use one attribute for grouping data. The example above on plt.close('all') -If the first grouping attribute is not specified the plots will be arranged in a row. +In ``seaborn``, this can also be done by only specifying one of the ``row`` +and ``col`` arguments. + +In the example below the colour and shape of the scatter plot graphical +objects are mapped to the 'day' and 'size' attributes respectively. You use +scale objects to specify these mappings. The list of scale classes is +given below with initialization arguments for quick reference. .. ipython:: python plt.figure() - plot = rplot.RPlot(tips_data, x='total_bill', y='tip') - plot.add(rplot.TrellisGrid(['.', 'smoker'])) - plot.add(rplot.GeomHistogram()) - plot = rplot.RPlot(tips_data, x='tip', y='total_bill') plot.add(rplot.TrellisGrid(['sex', 'smoker'])) plot.add(rplot.GeomPoint(size=80.0, colour=rplot.ScaleRandomColour('day'), shape=rplot.ScaleShape('size'), alpha=1.0)) @@ -1779,38 +1848,12 @@ If the first grouping attribute is not specified the plots will be arranged in a plt.close('all') -As shown above, scatter plots are also possible. Scatter plots allow you to map various data attributes to graphical properties of the plot.
In the example above the colour and shape of the scatter plot graphical objects is mapped to 'day' and 'size' attributes respectively. You use scale objects to specify these mappings. The list of scale classes is given below with initialization arguments for quick reference. - - -Scales -~~~~~~ - -:: - - ScaleGradient(column, colour1, colour2) - -This one allows you to map an attribute (specified by parameter column) value to the colour of a graphical object. The larger the value of the attribute the closer the colour will be to colour2, the smaller the value, the closer it will be to colour1. - -:: - - ScaleGradient2(column, colour1, colour2, colour3) - -The same as ScaleGradient but interpolates linearly between three colours instead of two. - -:: - - ScaleSize(column, min_size, max_size, transform) - -Map attribute value to size of the graphical object. Parameter min_size (default 5.0) is the minimum size of the graphical object, max_size (default 100.0) is the maximum size and transform is a one argument function that will be used to transform the attribute value (defaults to lambda x: x). - -:: - - ScaleShape(column) - -Map the shape of the object to attribute value. The attribute has to be categorical. +This can also be done in ``seaborn``, at least for 3 variables: -:: +.. code-block:: python - ScaleRandomColour(column) + g = sns.FacetGrid(tips_data, row="sex", col="smoker", hue="day") + g.map(plt.scatter, "tip", "total_bill") + g.add_legend() -Assign a random colour to a value of categorical attribute specified by column. +.. image:: _static/rplot-seaborn-example6.png diff --git a/doc/source/whatsnew/v0.16.0.txt b/doc/source/whatsnew/v0.16.0.txt index d6c9b3ed1a6c6..6fcbf5f9eabf0 100644 --- a/doc/source/whatsnew/v0.16.0.txt +++ b/doc/source/whatsnew/v0.16.0.txt @@ -365,10 +365,25 @@ The behavior of a small sub-set of edge cases for using ``.loc`` have changed (: In [4]: df.loc[2:3] TypeError: Cannot do slice indexing on with keys + +.. 
_whatsnew_0160.deprecations: + Deprecations ~~~~~~~~~~~~ -.. _whatsnew_0160.deprecations: +- The ``rplot`` trellis plotting interface is deprecated and will be removed + in a future version. We refer to external packages like + `seaborn `_ for similar + but more refined functionality (:issue:`3445`). + + The documentation includes some examples of how to convert your existing code + using ``rplot`` to seaborn: - :ref:`rplot docs ` + + +.. _whatsnew_0160.prior_deprecations: + +Removal of prior version deprecations/changes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - ``DataFrame.pivot_table`` and ``crosstab``'s ``rows`` and ``cols`` keyword arguments were removed in favor of ``index`` and ``columns`` (:issue:`6581`) diff --git a/pandas/tools/rplot.py b/pandas/tools/rplot.py index 1c3d17ee908cb..c3c71ab749536 100644 --- a/pandas/tools/rplot.py +++ b/pandas/tools/rplot.py @@ -1,4 +1,5 @@ import random +import warnings from copy import deepcopy from pandas.core.common import _values_from_object @@ -9,6 +10,16 @@ # * Make sure legends work properly # + +warnings.warn("\n" + "The rplot trellis plotting interface is deprecated and will be " + "removed in a future version. We refer to external packages " + "like seaborn for similar but more refined functionality. \n\n" + "See our docs http://pandas.pydata.org/pandas-docs/stable/visualization.html#rplot " + "for some examples of how to convert your existing code to these " + "packages.", FutureWarning) + + class Scale: """ Base class for mapping between graphical and data attributes.