@@ -125,3 +125,79 @@ xarray objects do not yet support hierarchical indexes, so if your data has
125
125
a hierarchical index, you will either need to unstack it first or use the
126
126
:py:meth:`~xarray.DataArray.from_series` or
127
127
:py:meth:`~xarray.Dataset.from_dataframe` constructors described above.
128
+
129
+
130
+ Transitioning from pandas.Panel to xarray
131
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
132
+
133
+ :py:class:`~pandas.Panel`, pandas's data structure for 3D arrays, has always been a second-class
134
+ data structure compared to the Series and DataFrame. To allow pandas developers to focus more on
135
+ its core functionality built around the DataFrame, pandas plans to eventually deprecate Panel.
136
+
137
+ xarray has most of ``Panel``'s features, a more explicit API (particularly around
138
+ indexing), and the ability to scale to >3 dimensions with the same interface.
139
+
140
+ As discussed in the xarray docs, there are two primary data structures in xarray:
141
+ ``DataArray`` and ``Dataset``. You can imagine a ``DataArray`` as an n-dimensional pandas
142
+ ``Series`` (i.e. a single typed array), and a ``Dataset`` as the ``DataFrame``-equivalent
143
+ (i.e. a dict of aligned ``DataArray`` objects).
144
+
145
+ So you can represent a Panel in two ways:
146
+ - A 3-dimensional ``DataArray``
147
+ - A ``Dataset`` containing a number of 2-dimensional ``DataArray`` objects
148
+
149
+ .. ipython:: python
150
+     panel = pd.Panel(np.random.rand(2, 3, 4), items=list('ab'), major_axis=list('mno'),
151
+                      minor_axis=pd.date_range(start='2000', periods=4, name='date'))
152
+
153
+     panel
154
+
155
+
156
+ As a DataArray:
157
+
158
+
159
+ .. ipython:: python
160
+
161
+     xr.DataArray(panel)
162
+
163
+ Or:
164
+
165
+
166
+ .. ipython:: python
167
+
168
+     panel.to_xarray()
169
+
170
+
171
+ As you can see, there are three dimensions (each is also a coordinate). Two of the
172
+ axes of the panel were unnamed, so they have been assigned ``dim_0`` and ``dim_1`` respectively,
173
+ while the third retains its name ``date``.
174
+
175
+
176
+ As a Dataset:
177
+
178
+ .. ipython:: python
179
+     xr.Dataset(panel)
180
+
181
+ Here, there are two data variables, each representing a DataFrame on panel's ``items``
182
+ axis, and labelled as such. Each variable is a 2D array of the respective values along
183
+ the ``items`` dimension.
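For readers whose pandas version no longer ships ``Panel`` (it was removed in pandas 0.25), the two layouts described above can be sketched directly from a NumPy array. This is a minimal illustration rather than part of the original docs: the dimension names ``items`` and ``major_axis`` and the random values are assumptions chosen to mirror the example panel.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Same shape as the example panel: 2 items x 3 major-axis labels x 4 dates.
values = np.random.rand(2, 3, 4)

# Layout 1: a single 3-dimensional DataArray with explicitly named dimensions.
da = xr.DataArray(
    values,
    dims=("items", "major_axis", "date"),
    coords={
        "items": list("ab"),
        "major_axis": list("mno"),
        "date": pd.date_range(start="2000", periods=4),
    },
)

# Layout 2: a Dataset with one 2-dimensional variable per entry along "items".
ds = da.to_dataset(dim="items")

print(da.dims)
print(list(ds.data_vars))
```

Naming every dimension up front avoids the automatically assigned ``dim_0`` and ``dim_1`` names seen earlier.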
184
+
185
+ While the xarray docs are relatively complete, a few items stand out for Panel users:
186
+ - A DataArray's data is stored as a numpy array, and so can only contain a single
187
+   type. As a result, a Panel that contains :py:class:`~pandas.DataFrame` objects with
188
+   multiple types will be converted to ``object`` dtype. A ``Dataset`` of multiple ``DataArray`` objects,
189
+   each with its own dtype, will allow original types to be preserved.
190
+ - Indexing is similar to pandas, but more explicit and leverages xarray's naming
191
+   of dimensions.
192
+ - Because of those features, working with much higher-dimensional data is very practical.
193
+ - Variables in a ``Dataset`` can use a subset of its dimensions. For example, you can
194
+   have one variable with Person x Score x Time, and another with Person x Score.
195
+ - Coordinates are used both for dimensions and for variables which
196
+   *label* the data variables, so you could have a coordinate Age that labels the
197
+   ``Person`` dimension of a ``Dataset`` of Person x Score x Time.
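The points in the list above can be illustrated with a small sketch. All names here (``result``, ``passed``, ``age``, and the person and score labels) are hypothetical and chosen only for illustration. The example assumes a ``Dataset`` whose variables have different dtypes and use different subsets of the dimensions, plus a non-dimension coordinate that labels the ``person`` dimension.

```python
import numpy as np
import pandas as pd
import xarray as xr

people = ["alice", "bob", "carol"]
scores = ["math", "reading"]
times = pd.date_range("2000-01-01", periods=4)

ds = xr.Dataset(
    data_vars={
        # A float variable over all three dimensions.
        "result": (("person", "score", "time"), np.random.rand(3, 2, 4)),
        # A boolean variable over only two of the dimensions.
        "passed": (
            ("person", "score"),
            np.array([[True, False], [True, True], [False, True]]),
        ),
    },
    coords={
        "person": people,
        "score": scores,
        "time": times,
        # A non-dimension coordinate that labels the "person" dimension.
        "age": ("person", [31, 45, 27]),
    },
)

# Label-based and position-based indexing both refer to dimensions by name.
by_name = ds["result"].sel(person="bob", score="math")
by_pos = ds["result"].isel(time=0)

print(ds["result"].dtype, ds["passed"].dtype)
print(by_name.dims, by_pos.dims)
```

Each variable keeps its own dtype, and the ``age`` coordinate travels with any selection along ``person``.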
198
+
199
+
200
+ While xarray may take some getting used to, it's worth it! If anything is unclear,
201
+ please post an issue on `GitHub <https://github.com/pydata/xarray>`__ or
202
+ `StackOverflow <http://stackoverflow.com/questions/tagged/python-xarray>`__,
203
+ and we'll endeavor to respond to the specific case or improve the general docs.