@@ -2170,37 +2170,45 @@ multiple tables at once. The idea is to have one table (call it the
selector table) that you index most/all of the columns, and perform your
queries. The other table(s) are data tables with an index matching the
selector table's index. You can then perform a very fast query
- on the selector table, yet get lots of data back. This method works similar to
- having a very wide table, but is more efficient in terms of queries.
+ on the selector table, yet get lots of data back. This method is similar to
+ having a very wide table, but enables more efficient queries.

- Note, **THE USER IS RESPONSIBLE FOR SYNCHRONIZING THE TABLES**. This
- means, append to the tables in the same order; ``append_to_multiple``
- splits a single object to multiple tables, given a specification (as a
- dictionary). This dictionary is a mapping of the table names to the
- 'columns' you want included in that table. Pass a `None` for a single
- table (optional) to let it have the remaining columns. The argument
- ``selector`` defines which table is the selector table.
+ The ``append_to_multiple`` method splits a given single DataFrame
+ into multiple tables according to ``d``, a dictionary that maps the
+ table names to a list of 'columns' you want in that table. If `None`
+ is used in place of a list, that table will have the remaining
+ unspecified columns of the given DataFrame. The argument ``selector``
+ defines which table is the selector table (which you can make queries from).
+ The argument ``dropna`` will drop rows from the input DataFrame to ensure
+ tables are synchronized. This means that if a row for one of the tables
+ being written to is entirely ``np.NaN``, that row will be dropped from all tables.
+
+ If ``dropna`` is False, **THE USER IS RESPONSIBLE FOR SYNCHRONIZING THE TABLES**.
+ Remember that entirely ``np.Nan`` rows are not written to the HDFStore, so if
+ you choose to call ``dropna=False``, some tables may have more rows than others,
+ and therefore ``select_as_multiple`` may not work or it may return unexpected
+ results.

.. ipython:: python

   df_mt = DataFrame(randn(8, 6), index=date_range('1/1/2000', periods=8),
                     columns=['A', 'B', 'C', 'D', 'E', 'F'])
   df_mt['foo'] = 'bar'
+    df_mt.ix[1, ('A', 'B')] = np.nan

   # you can also create the tables individually
   store.append_to_multiple({'df1_mt': ['A', 'B'], 'df2_mt': None},
                            df_mt, selector='df1_mt')
   store

-    # indiviual tables were created
+    # individual tables were created
   store.select('df1_mt')
   store.select('df2_mt')

   # as a multiple
   store.select_as_multiple(['df1_mt', 'df2_mt'], where=['A>0', 'B>0'],
                            selector='df1_mt')

- .. _io.hdf5-delete:

Delete from a Table
~~~~~~~~~~~~~~~~~~~
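The row-dropping behaviour that the new ``dropna`` paragraph describes can be illustrated in memory, without an HDFStore or PyTables. The sketch below is only an approximation of the documented semantics, not the pandas implementation: `split_for_tables` is a hypothetical helper, and the final per-table `dropna(how="all")` stands in for the statement in the text that entirely NaN rows are not written to the store.

```python
import numpy as np
import pandas as pd


def split_for_tables(df, spec, dropna=True):
    """Split df into per-table frames (hypothetical helper, not pandas API)."""
    # Columns named explicitly in the spec; a None entry takes the rest.
    named = [c for cols in spec.values() if cols is not None for c in cols]
    remaining = [c for c in df.columns if c not in named]
    resolved = {name: (remaining if cols is None else list(cols))
                for name, cols in spec.items()}

    if dropna:
        # Keep only rows that are not entirely NaN in any one table.
        valid = df.index
        for cols in resolved.values():
            valid = valid.intersection(df[cols].dropna(how="all").index)
        df = df.loc[valid]

    # Assumption taken from the text: an entirely NaN row is never written.
    return {name: df[cols].dropna(how="all")
            for name, cols in resolved.items()}


df = pd.DataFrame(np.arange(12.0).reshape(4, 3), columns=["A", "B", "C"])
df.loc[1, ["A", "B"]] = np.nan  # row 1 is all-NaN for the ('A', 'B') table only

spec = {"df1": ["A", "B"], "df2": None}
synced = split_for_tables(df, spec, dropna=True)
unsynced = split_for_tables(df, spec, dropna=False)

print(len(synced["df1"]), len(synced["df2"]))      # 3 3
print(len(unsynced["df1"]), len(unsynced["df2"]))  # 3 4
```

With ``dropna=True`` both frames keep the same three index labels, so a selector-based lookup stays aligned. With ``dropna=False`` the first frame silently loses row 1 while the second keeps it, which is the mismatch the warning in the diff refers to.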