ENH: Add optional argument index to pd.melt to maintain index values #33659


Merged 83 commits on Jul 9, 2020
Commits
a4f2d22
initial xlsb support
Rik-de-Kort Nov 24, 2019
62564cf
Import order fix for CI pass
Rik-de-Kort Nov 25, 2019
a7a8460
Initial tests
Rik-de-Kort Nov 26, 2019
d9be281
style fixes
Rik-de-Kort Nov 28, 2019
8bf8c78
documentation
Rik-de-Kort Nov 28, 2019
cd95dce
forgot place to document
Rik-de-Kort Nov 28, 2019
7a7390d
Fixed test issue with XLRDError
Rik-de-Kort Nov 30, 2019
248ac12
Fix for unnamed column issue
Rik-de-Kort Nov 30, 2019
6ea78de
style fix
Rik-de-Kort Dec 1, 2019
44c5439
line up with upstream master
Rik-de-Kort Dec 1, 2019
92c98cd
Merge branch 'master' of https://github.com/pandas-dev/pandas
Rik-de-Kort Dec 1, 2019
64fa6f3
Fix broken xlrd test
Rik-de-Kort Dec 2, 2019
cb276e8
get docs to build
Rik-de-Kort Dec 2, 2019
4ebcb48
Remove warning filter
Rik-de-Kort Dec 6, 2019
71436a0
Merge branch 'master' of https://github.com/Rik-de-Kort/pandas
Rik-de-Kort Dec 6, 2019
00cc66b
extended description update
Rik-de-Kort Dec 7, 2019
4c81853
Merge branch 'master' of https://github.com/pandas-dev/pandas
Rik-de-Kort Dec 7, 2019
e85da03
Xlsb options instead of odf options
Rik-de-Kort Dec 9, 2019
2348c3b
Add reference in whatsnew to docs
Rik-de-Kort Dec 11, 2019
d02a5a5
Make pyxlsb show up in install.rst and show_versions
Rik-de-Kort Dec 11, 2019
c71e021
Add pyxlsb to ci builds
Rik-de-Kort Dec 14, 2019
ae3f9ea
environment.yml update
Rik-de-Kort Dec 14, 2019
a410e51
Merge upstream master
Rik-de-Kort Dec 15, 2019
7c9dcce
One update to environment.yml too many
Rik-de-Kort Dec 19, 2019
4bd8400
Trying to fix build
Rik-de-Kort Dec 23, 2019
43ab0fe
Merge upstream
Rik-de-Kort Jan 15, 2020
024492a
Added issue number
Rik-de-Kort Jan 15, 2020
b424c8e
Updated to use .rows(sparse=False) for future compat
Rik-de-Kort Jan 15, 2020
571489b
Merge branch 'master' of https://github.com/pandas-dev/pandas
Rik-de-Kort Jan 17, 2020
dad4a53
xfails in test_readers.py
Rik-de-Kort Jan 17, 2020
9b6bc9a
xfail url loads
Rik-de-Kort Jan 18, 2020
b92348e
Merge branch 'master' of https://github.com/pandas-dev/pandas
Rik-de-Kort Jan 22, 2020
c2cbfd7
Update min version
Rik-de-Kort Jan 22, 2020
799bb28
test xfailing for the right reason
Rik-de-Kort Jan 22, 2020
b97c4ae
xfail unnecessary due to consistency check only
Rik-de-Kort Jan 22, 2020
10c7cde
Merge branch 'master' of https://github.com/pandas-dev/pandas
Rik-de-Kort Mar 23, 2020
594fea4
Merge branch 'master' of https://github.com/Rik-de-Kort/pandas
Rik-de-Kort Apr 19, 2020
dda5657
Initial fix over from original repo.
Rik-de-Kort Apr 19, 2020
1233b9e
Proper implementation
Rik-de-Kort Apr 19, 2020
0f45e4d
Multiindex support
Rik-de-Kort Apr 19, 2020
9e8eaac
Merge branch 'master' of https://github.com/pandas-dev/pandas
Rik-de-Kort Apr 19, 2020
bb179b7
Updated docstring
Rik-de-Kort Apr 19, 2020
4161ede
Whatsnew entry
Rik-de-Kort Apr 19, 2020
7ae7261
Fix mypy error
Rik-de-Kort Apr 20, 2020
7cee87d
Merge branch 'master' of https://github.com/pandas-dev/pandas
Rik-de-Kort Apr 21, 2020
0455f64
renamed kwarg, docstring
Rik-de-Kort Apr 21, 2020
d5fb84e
updated reshaping.rst
Rik-de-Kort Apr 21, 2020
d65e836
PEP8 issue
Rik-de-Kort Apr 21, 2020
f782fee
PEP8 issue
Rik-de-Kort Apr 21, 2020
14062aa
Updated whatsnew
Rik-de-Kort Apr 21, 2020
39bd069
trailing whitespace in reshaping.rst
Rik-de-Kort Apr 21, 2020
0bc198c
Resolve comments
Rik-de-Kort Apr 22, 2020
e67c842
Maybe this will fix doc linting?
Rik-de-Kort Apr 22, 2020
666a856
Names and types
Rik-de-Kort Apr 26, 2020
250fa5e
Merge upstream master
Rik-de-Kort Apr 26, 2020
fc7e50b
Merging master, again
Rik-de-Kort Apr 28, 2020
f3b1dca
Update doc/source/user_guide/reshaping.rst
Rik-de-Kort Apr 28, 2020
8316cbf
Update pandas/core/frame.py
Rik-de-Kort Apr 28, 2020
2b6ec46
Merge branch 'master' of https://github.com/pandas-dev/pandas
Rik-de-Kort May 1, 2020
7ac9aa5
Merge branch 'master' of https://github.com/Rik-de-Kort/pandas
Rik-de-Kort May 1, 2020
118a15d
Merge upstream master
Rik-de-Kort May 17, 2020
64afbd0
Reviewer comments
Rik-de-Kort May 17, 2020
a1ecb46
transform note?
Rik-de-Kort May 17, 2020
551e40f
Test full frame
Rik-de-Kort May 17, 2020
a462afd
Linting
Rik-de-Kort May 17, 2020
b1bf46d
Merge branch 'master' of https://github.com/pandas-dev/pandas
Rik-de-Kort May 23, 2020
368bfa5
Documentation fixes
Rik-de-Kort May 23, 2020
bf7d5e5
Fixed documentation
Rik-de-Kort May 26, 2020
9008ccc
Merge branch 'master' of https://github.com/pandas-dev/pandas
Rik-de-Kort Jun 1, 2020
a6ec490
Merge branch 'master' of https://github.com/pandas-dev/pandas
Rik-de-Kort Jun 8, 2020
0391b7a
Merge branch 'master' of https://github.com/Rik-de-Kort/pandas
Rik-de-Kort Jun 8, 2020
800c050
Fixed docs (Hopefully)
Rik-de-Kort Jun 8, 2020
788c28a
Merge upstream master
Rik-de-Kort Jun 25, 2020
e134ed2
Hopefully fix documentation bug
Rik-de-Kort Jun 25, 2020
7a765a3
Fix typing error
Rik-de-Kort Jun 25, 2020
b1cca84
Apply suggestions from code review
Rik-de-Kort Jun 25, 2020
c66767d
Doc review:
Rik-de-Kort Jun 25, 2020
7f5018f
Type!
Rik-de-Kort Jun 25, 2020
16e9bd4
Added example for difference
Rik-de-Kort Jun 26, 2020
df645e1
Merge branch 'master' of https://github.com/pandas-dev/pandas
Rik-de-Kort Jun 26, 2020
bbf8465
Linting failure?
Rik-de-Kort Jun 26, 2020
57bffd1
TomAugspurger suggestion
Rik-de-Kort Jul 7, 2020
edcd123
Trailing whitespace...
Rik-de-Kort Jul 7, 2020
16 changes: 16 additions & 0 deletions doc/source/user_guide/reshaping.rst
@@ -296,6 +296,22 @@ For instance,
cheese.melt(id_vars=['first', 'last'])
cheese.melt(id_vars=['first', 'last'], var_name='quantity')

When transforming a DataFrame using :func:`~pandas.melt`, the index is ignored by default. The original index values can be kept by setting the ``ignore_index`` parameter to ``False`` (default is ``True``). Each index value is then repeated, once per melted column.

.. versionadded:: 1.1.0

.. ipython:: python

index = pd.MultiIndex.from_tuples([('person', 'A'), ('person', 'B')])
cheese = pd.DataFrame({'first': ['John', 'Mary'],
'last': ['Doe', 'Bo'],
'height': [5.5, 6.0],
'weight': [130, 150]},
index=index)
cheese
cheese.melt(id_vars=['first', 'last'])
cheese.melt(id_vars=['first', 'last'], ignore_index=False)

Another way to transform is to use the :func:`~pandas.wide_to_long` panel data
convenience function. It is less flexible than :func:`~pandas.melt`, but more
user-friendly.
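
As a reviewer-style illustration of the documentation change above, here is a minimal sketch comparing the default behaviour with ``ignore_index=False``. It assumes pandas 1.1.0 or later (the version named in the ``versionadded`` directive); the frame and its ``"a"``/``"b"`` labels are made up for the example and do not come from the PR.

```python
import pandas as pd

# Illustrative frame; the index labels "a" and "b" are hypothetical.
df = pd.DataFrame(
    {"first": ["John", "Mary"], "height": [5.5, 6.0], "weight": [130, 150]},
    index=["a", "b"],
)

# Default (ignore_index=True): the original index is dropped and the
# result gets a fresh 0..n-1 index.
default = df.melt(id_vars=["first"])

# ignore_index=False: each original label appears once per melted column.
kept = df.melt(id_vars=["first"], ignore_index=False)

# If a unique index is needed afterwards, the repeated labels can be
# moved into an ordinary column.
flat = kept.reset_index()

print(default.index.tolist())
print(kept.index.tolist())
print(flat.columns.tolist())
```

Because the retained labels are duplicated, label-based lookups such as ``kept.loc["a"]`` return several rows rather than one, which is worth keeping in mind when choosing between the two modes.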
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v1.1.0.rst
@@ -287,6 +287,7 @@ Other enhancements
This can be used to set a custom compression level, e.g.,
``df.to_csv(path, compression={'method': 'gzip', 'compresslevel': 1}``
(:issue:`33196`)
- :meth:`melt` has gained an ``ignore_index`` (default ``True``) argument that, if set to ``False``, prevents the method from dropping the index (:issue:`17440`).
- :meth:`Series.update` now accepts objects that can be coerced to a :class:`Series`,
such as ``dict`` and ``list``, mirroring the behavior of :meth:`DataFrame.update` (:issue:`33215`)
- :meth:`~pandas.core.groupby.GroupBy.transform` and :meth:`~pandas.core.groupby.GroupBy.aggregate` has gained ``engine`` and ``engine_kwargs`` arguments that supports executing functions with ``Numba`` (:issue:`32854`, :issue:`33388`)
@@ -1143,3 +1144,4 @@ Other

Contributors
~~~~~~~~~~~~

1 change: 0 additions & 1 deletion pandas/core/apply.py
@@ -6,7 +6,6 @@

from pandas._config import option_context

from pandas._libs import reduction as libreduction
from pandas._typing import Axis
from pandas.util._decorators import cache_readonly

4 changes: 3 additions & 1 deletion pandas/core/frame.py
@@ -2140,7 +2140,7 @@ def to_stata(
from pandas.io.stata import StataWriter117 as statawriter # type: ignore
else: # versions 118 and 119
# mypy: Name 'statawriter' already defined (possibly by an import)
- from pandas.io.stata import StataWriterUTF8 as statawriter # type:ignore
+ from pandas.io.stata import StataWriterUTF8 as statawriter # type: ignore

kwargs: Dict[str, Any] = {}
if version is None or version >= 117:
@@ -7086,6 +7086,7 @@ def melt(
var_name=None,
value_name="value",
col_level=None,
ignore_index=True,
) -> "DataFrame":

return melt(
@@ -7095,6 +7096,7 @@ def melt(
var_name=var_name,
value_name=value_name,
col_level=col_level,
ignore_index=ignore_index,
)

# ----------------------------------------------------------------------
10 changes: 8 additions & 2 deletions pandas/core/reshape/melt.py
@@ -13,6 +13,7 @@
import pandas.core.common as com
from pandas.core.indexes.api import Index, MultiIndex
from pandas.core.reshape.concat import concat
from pandas.core.reshape.util import _tile_compat
from pandas.core.shared_docs import _shared_docs
from pandas.core.tools.numeric import to_numeric

@@ -31,8 +32,8 @@ def melt(
var_name=None,
value_name="value",
col_level=None,
ignore_index: bool = True,
) -> "DataFrame":
- # TODO: what about the existing index?
# If multiindex, gather names of columns on all level for checking presence
# of `id_vars` and `value_vars`
if isinstance(frame.columns, MultiIndex):
@@ -121,7 +122,12 @@ def melt(
# asanyarray will keep the columns as an Index
mdata[col] = np.asanyarray(frame.columns._get_level_values(i)).repeat(N)

- return frame._constructor(mdata, columns=mcolumns)
+ result = frame._constructor(mdata, columns=mcolumns)
+
+ if not ignore_index:
+     result.index = _tile_compat(frame.index, K)
+
+ return result


@deprecate_kwarg(old_arg_name="label", new_arg_name=None)
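
The hunk above assigns ``_tile_compat(frame.index, K)`` to the result index. The diff does not show how ``K`` is computed or what the private helper does internally, so the following is only a sketch under two assumptions: that ``K`` is the number of melted value columns, and that tiling means repeating the whole index end to end. It uses public calls only (``numpy.tile`` and ``Index.take``); ``tile_index`` is a hypothetical name, not a pandas function.

```python
import numpy as np
import pandas as pd


def tile_index(index: pd.Index, k: int) -> pd.Index:
    """Repeat the whole index k times, keeping its name and dtype.

    Hypothetical helper for illustration only; not part of pandas.
    """
    # Positions 0..n-1 repeated k times, e.g. [0, 1, 0, 1] for n=2, k=2.
    positions = np.tile(np.arange(len(index)), k)
    return index.take(positions)


idx = pd.Index(["foo", "bar"], dtype="category", name="baz")
tiled = tile_index(idx, 2)

# Index.repeat would instead repeat each element in place.
repeated = idx.repeat(2)

print(tiled.tolist())
print(repeated.tolist())
```

Tiling rather than element-wise repetition matters here because the melted frame stacks all rows for the first variable, then all rows for the next one, as the docstring example with index ``0, 1, 2, 0, 1, 2`` shows. ``Index.repeat`` would produce ``foo, foo, bar, bar`` and misalign the labels.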
16 changes: 16 additions & 0 deletions pandas/core/shared_docs.py
@@ -28,6 +28,11 @@
Name to use for the 'value' column.
col_level : int or str, optional
If columns are a MultiIndex then use this level to melt.
ignore_index : bool, default True
If True, original index is ignored. If False, the original index is retained.
Index labels will be repeated as necessary.

.. versionadded:: 1.1.0

Returns
-------
@@ -78,6 +83,17 @@
1 b B 3
2 c B 5

Original index values can be kept around:

>>> %(caller)sid_vars=['A'], value_vars=['B', 'C'], ignore_index=False)
A variable value
0 a B 1
1 b B 3
2 c B 5
0 a C 2
1 b C 4
2 c C 6

If you have multi-index columns:

>>> df.columns = [list('ABC'), list('DEF')]
41 changes: 41 additions & 0 deletions pandas/tests/reshape/test_melt.py
@@ -357,6 +357,47 @@ def test_melt_mixed_int_str_value_vars(self):
expected = DataFrame({"variable": [0, "a"], "value": ["foo", "bar"]})
tm.assert_frame_equal(result, expected)

def test_ignore_index(self):
# GH 17440
df = DataFrame({"foo": [0], "bar": [1]}, index=["first"])
result = melt(df, ignore_index=False)
expected = DataFrame(
{"variable": ["foo", "bar"], "value": [0, 1]}, index=["first", "first"]
)
tm.assert_frame_equal(result, expected)

def test_ignore_multiindex(self):
# GH 17440
index = pd.MultiIndex.from_tuples(
[("first", "second"), ("first", "third")], names=["baz", "foobar"]
)
df = DataFrame({"foo": [0, 1], "bar": [2, 3]}, index=index)
result = melt(df, ignore_index=False)

expected_index = pd.MultiIndex.from_tuples(
[("first", "second"), ("first", "third")] * 2, names=["baz", "foobar"]
)
expected = DataFrame(
{"variable": ["foo"] * 2 + ["bar"] * 2, "value": [0, 1, 2, 3]},
index=expected_index,
)

tm.assert_frame_equal(result, expected)

def test_ignore_index_name_and_type(self):
# GH 17440
index = pd.Index(["foo", "bar"], dtype="category", name="baz")
df = DataFrame({"x": [0, 1], "y": [2, 3]}, index=index)
result = melt(df, ignore_index=False)

expected_index = pd.Index(["foo", "bar"] * 2, dtype="category", name="baz")
expected = DataFrame(
{"variable": ["x", "x", "y", "y"], "value": [0, 1, 2, 3]},
index=expected_index,
)

tm.assert_frame_equal(result, expected)


class TestLreshape:
def test_pairs(self):
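
The tests above check that labels, names and dtypes survive the melt. One practical consequence, sketched here as an aside rather than as something the PR itself tests, is that a frame melted with ``ignore_index=False`` can be pivoted back to wide form using the retained labels. The frame and the ``"r1"``/``"r2"`` labels are invented for the example; the sketch assumes pandas 1.1.0 or later and a unique original index.

```python
import pandas as pd

# Illustrative frame with a unique, hypothetical row index.
df = pd.DataFrame({"foo": [0, 1], "bar": [2, 3]}, index=["r1", "r2"])

# Long form; the labels r1/r2 are each repeated once per melted column.
long = df.melt(ignore_index=False)

# pivot() uses the existing index when no `index` argument is given,
# so the retained labels identify the rows of the wide frame.
wide = long.pivot(columns="variable", values="value")

print(wide)
```

With the default ``ignore_index=True`` the same pivot would have nothing to group the rows by, since every melted row would carry a distinct integer label.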