Merged
5 changes: 5 additions & 0 deletions doc/source/reference/io.rst
@@ -83,6 +83,11 @@ HDFStore: PyTables (HDF5)
HDFStore.groups
HDFStore.walk

.. warning::

   One can store a subclass of ``DataFrame`` or ``Series`` to HDF5,
   but the type of the subclass is lost upon storing.

Feather
~~~~~~~
.. autosummary::
3 changes: 2 additions & 1 deletion doc/source/whatsnew/v1.3.0.rst
@@ -241,7 +241,8 @@ I/O
- Bug in :func:`read_csv` not accepting ``usecols`` with different length than ``names`` for ``engine="python"`` (:issue:`16469`)
- Bug in :func:`read_csv` raising ``TypeError`` when ``names`` and ``parse_dates`` is specified for ``engine="c"`` (:issue:`33699`)
- Allow custom error values for parse_dates argument of :func:`read_sql`, :func:`read_sql_query` and :func:`read_sql_table` (:issue:`35185`)
-
- Bug in :func:`to_hdf` raising ``KeyError`` when called on
  subclasses of ``DataFrame`` or ``Series`` (:issue:`33748`).

Period
^^^^^^
5 changes: 5 additions & 0 deletions pandas/core/generic.py
@@ -2505,6 +2505,11 @@ def to_hdf(
In order to add another DataFrame or Series to an existing HDF file
please use append mode and a different key.

.. warning::

   One can store a subclass of ``DataFrame`` or ``Series`` to HDF5,
   but the type of the subclass is lost upon storing.

For more information see the :ref:`user guide <io.hdf5>`.

Parameters
6 changes: 4 additions & 2 deletions pandas/io/pytables.py
@@ -1646,8 +1646,10 @@ def error(t):
            "nor a value are passed"
        )
else:
    _TYPE_MAP = {Series: "series", DataFrame: "frame"}
    pt = _TYPE_MAP[type(value)]
    if isinstance(value, Series):
Review comment (Member):

total nit, but black permitting we could do

pt = "series" if isinstance(value, Series) else "frame"

        pt = "series"
    else:
        pt = "frame"

    # we are actually a table
    if format == "table":
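
For readers skimming this hunk, here is a minimal sketch of why the exact-type lookup raised ``KeyError`` for subclasses while the ``isinstance`` check does not. The classes below are stand-ins so the snippet runs without pandas; ``SubclassedDataFrame``, ``pandas_type_by_lookup`` and ``pandas_type_by_isinstance`` are illustrative names, not code from ``pandas/io/pytables.py``.

```python
# Stand-in classes (assumption: they only mimic the inheritance
# relationship, not the behaviour of the real pandas objects).
class Series:
    pass


class DataFrame:
    pass


class SubclassedDataFrame(DataFrame):
    pass


# Old approach: dictionary keyed on the exact type.
_TYPE_MAP = {Series: "series", DataFrame: "frame"}


def pandas_type_by_lookup(value):
    # type(value) is SubclassedDataFrame for a subclass instance,
    # which is not a key of _TYPE_MAP, so this raises KeyError.
    return _TYPE_MAP[type(value)]


def pandas_type_by_isinstance(value):
    # isinstance follows the inheritance chain, so subclasses are accepted.
    return "series" if isinstance(value, Series) else "frame"


sdf = SubclassedDataFrame()

try:
    pandas_type_by_lookup(sdf)
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(lookup_failed)
print(pandas_type_by_isinstance(sdf))
```

The ``isinstance`` form treats anything that is not a ``Series`` as a frame, which matches the branch in the diff; the sketch makes no claim about how other value types are rejected earlier in the real function.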
44 changes: 44 additions & 0 deletions pandas/tests/io/pytables/test_subclass.py
@@ -0,0 +1,44 @@
import numpy as np

from pandas import DataFrame, Series
import pandas._testing as tm
from pandas.tests.io.pytables.common import ensure_clean_path

from pandas.io.pytables import HDFStore, read_hdf


class TestHDFStoreSubclass:
Review comment (Member):

In pandas._testing we have SubclassedSeries and SubclassedDataFrame. Should we be using those here?

Reply (Member, Author):
Thank you for pointing this out!
Changed that.

    # GH 33748
    def test_supported_for_subclass_dataframe(self):
        data = {"a": [1, 2], "b": [3, 4]}
        sdf = tm.SubclassedDataFrame(data, dtype=np.intp)

        expected = DataFrame(data, dtype=np.intp)

        with ensure_clean_path("temp.h5") as path:
            sdf.to_hdf(path, "df")
            result = read_hdf(path, "df")
            tm.assert_frame_equal(result, expected)

        with ensure_clean_path("temp.h5") as path:
            with HDFStore(path) as store:
                store.put("df", sdf)
            result = read_hdf(path, "df")
            tm.assert_frame_equal(result, expected)

    def test_supported_for_subclass_series(self):
        data = [1, 2, 3]
        sser = tm.SubclassedSeries(data, dtype=np.intp)

        expected = Series(data, dtype=np.intp)

        with ensure_clean_path("temp.h5") as path:
            sser.to_hdf(path, "ser")
            result = read_hdf(path, "ser")
            tm.assert_series_equal(result, expected)

        with ensure_clean_path("temp.h5") as path:
            with HDFStore(path) as store:
                store.put("ser", sser)
            result = read_hdf(path, "ser")
            tm.assert_series_equal(result, expected)
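
The tests above compare values against a plain ``DataFrame`` or ``Series``. As a rough sketch of what the documented warning means in practice (not part of this PR), the snippet below round-trips a subclass and inspects the type that comes back. ``MyFrame`` and ``roundtrip_type`` are hypothetical names introduced for illustration, and the sketch assumes a pandas version that includes this fix plus the optional PyTables dependency.

```python
import os
import tempfile

import pandas as pd


class MyFrame(pd.DataFrame):
    """Hypothetical subclass used only for this illustration."""

    @property
    def _constructor(self):
        return MyFrame


def roundtrip_type(obj, key="df"):
    """Write obj to a temporary HDF5 file and return the type read back.

    Returns None when the optional PyTables dependency is missing.
    """
    try:
        import tables  # noqa: F401  (optional dependency used by to_hdf)
    except ImportError:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "temp.h5")
        obj.to_hdf(path, key=key)
        result = pd.read_hdf(path, key)
    return type(result)


sdf = MyFrame({"a": [1, 2], "b": [3, 4]})
restored = roundtrip_type(sdf)
print(type(sdf).__name__, "->", restored)
```

If the subclass type matters downstream, one option is to rebuild it after reading, for example ``MyFrame(pd.read_hdf(path, key))``.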