HDFStore.select iterator not working #12953

Open
cemsbr opened this issue Apr 22, 2016 · 2 comments
Labels
Bug IO HDF5 read_hdf, HDFStore

Comments

@cemsbr
Contributor

cemsbr commented Apr 22, 2016

Code

import numpy as np
import pandas as pd

store = pd.HDFStore('/tmp/tmp.h5')
dfq = pd.DataFrame(np.random.randn(10, 4), columns=list('ABCD'), index=pd.date_range('20130101', periods=10))
store.append('dfq', dfq, format='table')
store.select('dfq', where="columns=['A']", chunksize=3)  # or iterator=True

Error

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-202-a39cd8f01057> in <module>()
----> 1 store.select('dfq', where="columns=['A']", chunksize=3)

/usr/lib/python3.5/site-packages/pandas/io/pytables.py in select(self, key, where, start, stop, columns, iterator, chunksize, auto_close, **kwargs)
    678                            chunksize=chunksize, auto_close=auto_close)
    679 
--> 680         return it.get_result()
    681 
    682     def select_as_coordinates(

/usr/lib/python3.5/site-packages/pandas/io/pytables.py in get_result(self, coordinates)
   1351                     "can only use an iterator or chunksize on a table")
   1352 
-> 1353             self.coordinates = self.s.read_coordinates(where=self.where)
   1354 
   1355             return self

/usr/lib/python3.5/site-packages/pandas/io/pytables.py in read_coordinates(self, where, start, stop, **kwargs)
   3581             for field, op, filt in self.selection.filter.format():
   3582                 data = self.read_column(
-> 3583                     field, start=coords.min(), stop=coords.max() + 1)
   3584                 coords = coords[
   3585                     op(data.iloc[coords - coords.min()], filt).values]

/usr/lib/python3.5/site-packages/pandas/io/pytables.py in read_column(self, column, where, start, stop, **kwargs)
   3621                                       a.tz, True), name=column)
   3622 
-> 3623         raise KeyError("column [%s] not found in the table" % column)
   3624 
   3625 

KeyError: 'column [columns] not found in the table'

output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.5.0-1-ARCH
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.0
nose: None
pip: 8.1.1
setuptools: 20.9.0
Cython: 0.24
numpy: 1.11.0
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.1.2
sphinx: 1.4.1
patsy: 0.4.1
dateutil: 2.5.2
pytz: 2016.3
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5.2
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
@jreback
Contributor

jreback commented Apr 22, 2016

Yeah, it must not be handling the columns in the expression, which it just pulls out anyhow. I'll mark it, but this is pretty tricky. Passing columns as a keyword works:

In [28]: store.select('dfq', columns=['A'], iterator=True)
Out[28]: <pandas.io.pytables.TableIterator at 0x114882810>
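
For anyone landing here, the workaround can be sketched end to end as below. This is a minimal sketch, not a fix for the underlying bug: it assumes PyTables is installed, and the helper name `select_column_in_chunks` and the temporary file path are made up for illustration. The column selection is passed through the `columns=` keyword rather than inside the `where` expression, and the returned iterator is consumed while the store is still open.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def select_column_in_chunks(path, key="dfq", chunksize=3):
    """Write a small table, then read column 'A' back chunk by chunk.

    Hypothetical helper for illustration only; it mirrors the frame
    from the original report.
    """
    dfq = pd.DataFrame(
        np.random.randn(10, 4),
        columns=list("ABCD"),
        index=pd.date_range("20130101", periods=10),
    )
    with pd.HDFStore(path, mode="w") as store:
        store.append(key, dfq, format="table")
        # Pass the column selection as a keyword, not inside `where`,
        # and consume the iterator before the store is closed.
        chunks = list(store.select(key, columns=["A"], chunksize=chunksize))
    return dfq, chunks


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        dfq, chunks = select_column_in_chunks(os.path.join(tmpdir, "tmp.h5"))
        for chunk in chunks:
            print(chunk.shape)
```

Each chunk is an ordinary DataFrame holding only column A, so the pieces can be processed one at a time or glued back together with `pd.concat(chunks)`.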

@jreback jreback added this to the Next Major Release milestone Apr 22, 2016
@cemsbr
Contributor Author

cemsbr commented Apr 22, 2016

Thank you, it works this way. I saw the columns in the expression in the docs (e.g. "In [329]"). My code was taken from that page (I only added the chunksize keyword).
