@@ -702,7 +702,8 @@ on an entire ``DataFrame`` or ``Series``, row- or column-wise, or elementwise.
702 702
703 703 1. `Tablewise Function Application`_: :meth:`~DataFrame.pipe`
704 704 2. `Row or Column-wise Function Application`_: :meth:`~DataFrame.apply`
705- 3. Elementwise_ function application: :meth:`~DataFrame.applymap`
705+ 3. `Aggregation API`_: :meth:`~DataFrame.agg` and :meth:`~DataFrame.transform`
706+ 4. `Applying Elementwise Functions`_: :meth:`~DataFrame.applymap`
706 707
707 708 .. _basics.pipe:
708 709
@@ -778,6 +779,13 @@ statistics methods, take an optional ``axis`` argument:
778 779    df.apply(np.cumsum)
779 780    df.apply(np.exp)
780 781
782+ ``.apply()`` will also dispatch on a string method name.
783+
784+ .. ipython:: python
785+
786+    df.apply('mean')
787+    df.apply('mean', axis=1)
788+
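As a rough illustration of the string dispatch described above, the sketch below builds a small random frame (the seed, shape, and variable names are arbitrary choices for this example, not part of the original text) and compares ``df.apply('mean')`` with calling the method directly.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame; the seed and shape are arbitrary.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((5, 3)), columns=["A", "B", "C"])

# A string is looked up as a method name on the frame.
col_means = df.apply("mean")          # one value per column
row_means = df.apply("mean", axis=1)  # one value per row

# Compare with the direct method calls.
same_by_column = np.allclose(col_means, df.mean())
same_by_row = np.allclose(row_means, df.mean(axis=1))
```

Both comparisons should hold, since the string form simply names the method to call.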
781 789 Depending on the return type of the function passed to :meth:`~DataFrame.apply`,
782 790 the result will either be of lower dimension or the same dimension.
783 791
@@ -827,16 +835,212 @@ set to True, the passed function will instead receive an ndarray object, which
827 835 has positive performance implications if you do not need the indexing
828 836 functionality.
829 837
830- .. seealso::
838+ .. _basics.aggregate:
839+
840+ Aggregation API
841+ ~~~~~~~~~~~~~~~
842+
843+ .. versionadded:: 0.20.0
844+
845+ The aggregation API lets you express one or more aggregation operations in a single, concise call.
846+ This API is similar across pandas objects, :ref:`groupby aggregates <groupby.aggregate>`,
847+ :ref:`window functions <stats.aggregate>`, and the :ref:`resample API <timeseries.aggregate>`.
848+
849+ We will use a starting frame similar to the one above.
850+
851+ .. ipython:: python
852+
853+    tsdf = pd.DataFrame(np.random.randn(10, 3), columns=['A', 'B', 'C'],
854+                        index=pd.date_range('1/1/2000', periods=10))
855+    tsdf.iloc[3:7] = np.nan
856+    tsdf
857+
858+ Using a single function is equivalent to ``.apply``; you can also pass named methods as strings.
859+ These calls return a Series holding one result per column.
860+
861+ .. ipython:: python
862+
863+    tsdf.agg(np.sum)
864+
865+    tsdf.agg('sum')
866+
867+    # these are equivalent to a ``.sum()`` because we are aggregating on a single function
868+    tsdf.sum()
869+
870+ On a Series this will result in a scalar value.
871+
872+ .. ipython:: python
873+
874+    tsdf.A.agg('sum')
875+
876+
877+ Aggregating multiple functions at once
878+ ++++++++++++++++++++++++++++++++++++++
879+
880+ You can pass multiple aggregation functions as a list. The result of each passed function will be a row in the resulting DataFrame.
881+ The rows are named after the aggregation functions.
882+
883+ .. ipython:: python
884+
885+    tsdf.agg(['sum'])
886+
887+ Multiple functions yield multiple rows.
888+
889+ .. ipython:: python
890+
891+    tsdf.agg(['sum', 'mean'])
892+
893+ On a Series, multiple functions return a Series, indexed by the function names.
894+
895+ .. ipython:: python
896+
897+    tsdf.A.agg(['sum', 'mean'])
898+
899+
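The list form can be sketched outside the ipython session as follows. This is a minimal example under assumed inputs (the seeded random frame and the variable names are illustrative only); it checks that each function name becomes a row label on a DataFrame and an index label on a Series.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mirroring the one used in this section; the seed is arbitrary.
rng = np.random.default_rng(1)
tsdf = pd.DataFrame(rng.standard_normal((10, 3)), columns=["A", "B", "C"],
                    index=pd.date_range("2000-01-01", periods=10))
tsdf.iloc[3:7] = np.nan

# DataFrame: one row per aggregation function, one column per original column.
summary = tsdf.agg(["sum", "mean"])

# Series: one entry per aggregation function.
per_column = tsdf["A"].agg(["sum", "mean"])

row_labels = list(summary.index)
series_labels = list(per_column.index)
```

Missing values are skipped by ``sum`` and ``mean`` by default, so the NaN rows do not propagate into the summary.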
900+ Aggregating with a dict of functions
901+ ++++++++++++++++++++++++++++++++++++
902+
903+ Passing a dictionary that maps column names to a function or a list of functions to ``DataFrame.agg``
904+ allows you to customize which functions are applied to which columns.
905+
906+ .. ipython:: python
907+
908+    tsdf.agg({'A': 'mean', 'B': 'sum'})
909+
910+ Passing a list-like as any of the dict values will generate a DataFrame output. You will get a matrix-like output
911+ of all of the aggregators; entries for aggregators not requested for a column will be missing values.
912+
913+ .. ipython:: python
914+
915+    tsdf.agg({'A': ['mean', 'min'], 'B': 'sum'})
916+
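To make the "missing values" remark concrete, here is a small sketch under assumed inputs (the seeded frame and variable names are illustrative, not from the original). It requests different aggregators per column and inspects which cells are filled.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the seed is arbitrary.
rng = np.random.default_rng(2)
tsdf = pd.DataFrame(rng.standard_normal((10, 3)), columns=["A", "B", "C"],
                    index=pd.date_range("2000-01-01", periods=10))
tsdf.iloc[3:7] = np.nan

# 'A' gets mean and min, 'B' gets sum; column 'C' is not aggregated at all.
result = tsdf.agg({"A": ["mean", "min"], "B": "sum"})

# Cells for combinations that were not requested are NaN.
unrequested_is_nan = bool(np.isnan(result.loc["sum", "A"]))
requested_matches = bool(np.isclose(result.loc["mean", "A"], tsdf["A"].mean()))
```

The rows of ``result`` are the union of all requested function names, and only the columns named in the dict appear.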
917+ For a Series, you can pass a dict; the keys will set the name of the column.
918+
919+ .. ipython:: python
920+
921+    tsdf.A.agg({'foo': ['sum', 'mean']})
922+
923+ Alternatively, using a dict with multiple keys, you can rename the elements of the aggregation.
924+
925+ .. ipython:: python
926+
927+    tsdf.A.agg({'foo': 'sum', 'bar': 'mean'})
928+
929+ Multiple keys will yield multiple columns.
930+
931+ .. ipython:: python
932+
933+    tsdf.A.agg({'foo': ['sum', 'mean'], 'bar': ['min', 'max', lambda x: x.sum()+1]})
934+
935+ .. _basics.custom_describe:
936+
937+ Custom describe
938+ +++++++++++++++
939+
940+ With ``.agg()`` it is possible to easily create a custom describe function, similar
941+ to the built-in :ref:`describe function <basics.describe>`.
942+
943+ .. ipython:: python
944+
945+    from functools import partial
946+
947+    q_25 = partial(pd.Series.quantile, q=0.25)
948+    q_25.__name__ = '25%'
949+    q_75 = partial(pd.Series.quantile, q=0.75)
950+    q_75.__name__ = '75%'
951+
952+    tsdf.agg(['count', 'mean', 'std', 'min', q_25, 'median', q_75, 'max'])
953+
954+ .. _basics.transform:
955+
956+ Transform API
957+ ~~~~~~~~~~~~~
958+
959+ .. versionadded:: 0.20.0
960+
961+ The ``transform`` method returns an object that is indexed the same (same size)
962+ as the original. This API allows you to provide *multiple* operations at the same
963+ time rather than one-by-one. Its API is quite similar to the ``.agg`` API.
964+
965+ We will use a frame similar to the one in the sections above.
831966
832-    The section on :ref:`GroupBy <groupby>` demonstrates related, flexible
833-    functionality for grouping by some criterion, applying, and combining the
834-    results into a Series, DataFrame, etc.
967+ .. ipython:: python
968+
969+    tsdf = pd.DataFrame(np.random.randn(10, 3), columns=['A', 'B', 'C'],
970+                        index=pd.date_range('1/1/2000', periods=10))
971+    tsdf.iloc[3:7] = np.nan
972+    tsdf
973+
974+ Transform the entire frame. ``.transform()`` accepts a NumPy function, a string
975+ function name, or a user-defined function.
976+
977+ .. ipython:: python
835978
836- .. _Elementwise:
979+    tsdf.transform(np.abs)
980+    tsdf.transform('abs')
981+    tsdf.transform(lambda x: x.abs())
982+
983+ Since this is a single function, this is equivalent to a ufunc application.
984+
985+ .. ipython:: python
986+
987+    np.abs(tsdf)
988+
989+ Passing a single function to ``.transform()`` with a Series will yield a single Series in return.
990+
991+ .. ipython:: python
837992
838- Applying elementwise Python functions
839- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
993+    tsdf.A.transform(np.abs)
994+
995+
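The shape-preserving behaviour described above can be checked with a short sketch. The frame, seed, and variable names here are assumptions made for illustration; the example compares ``.transform(np.abs)`` with applying the ufunc directly.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the seed is arbitrary.
rng = np.random.default_rng(3)
tsdf = pd.DataFrame(rng.standard_normal((10, 3)), columns=["A", "B", "C"],
                    index=pd.date_range("2000-01-01", periods=10))
tsdf.iloc[3:7] = np.nan

# A single function returns an object with the same index and columns.
abs_frame = tsdf.transform(np.abs)
abs_series = tsdf["A"].transform(np.abs)

# Compare against applying the ufunc directly; NaN positions are treated as equal.
matches_ufunc = np.allclose(abs_frame, np.abs(tsdf), equal_nan=True)
```

Unlike an aggregation, the result keeps all ten rows, including the rows that are entirely NaN.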
996+ Transform with multiple functions
997+ +++++++++++++++++++++++++++++++++
998+
999+ Passing multiple functions will yield a DataFrame with MultiIndex columns.
1000+ The first level will be the original frame column names; the second level
1001+ will be the names of the transforming functions.
1002+
1003+ .. ipython:: python
1004+
1005+    tsdf.transform([np.abs, lambda x: x+1])
1006+
1007+ Passing multiple functions to a Series will yield a DataFrame. The
1008+ resulting column names will be the names of the transforming functions.
1009+
1010+ .. ipython:: python
1011+
1012+    tsdf.A.transform([np.abs, lambda x: x+1])
1013+
1014+
1015+ Transforming with a dict of functions
1016+ +++++++++++++++++++++++++++++++++++++
1017+
1018+
1019+ Passing a dict of functions will allow selective transforming per column.
1020+
1021+ .. ipython:: python
1022+
1023+    tsdf.transform({'A': np.abs, 'B': lambda x: x+1})
1024+
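A minimal sketch of the per-column form, under assumed inputs (the seeded frame and names are illustrative only): each dict key selects a column and each value is the function applied to it, so columns not named in the dict are absent from the result.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the seed is arbitrary.
rng = np.random.default_rng(4)
tsdf = pd.DataFrame(rng.standard_normal((10, 3)), columns=["A", "B", "C"],
                    index=pd.date_range("2000-01-01", periods=10))
tsdf.iloc[3:7] = np.nan

# 'A' is replaced by its absolute value, 'B' is shifted by one; 'C' is not selected.
selected = tsdf.transform({"A": np.abs, "B": lambda x: x + 1})

a_matches = np.allclose(selected["A"], tsdf["A"].abs(), equal_nan=True)
b_matches = np.allclose(selected["B"], tsdf["B"] + 1, equal_nan=True)
```

The row index is unchanged, which is what distinguishes this from the dict form of ``.agg()``.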
1025+ Passing a dict of lists will generate a multi-indexed DataFrame with these
1026+ selective transforms.
1027+
1028+ .. ipython:: python
1029+
1030+    tsdf.transform({'A': np.abs, 'B': [lambda x: x+1, 'sqrt']})
1031+
1032+ On a Series, passing a dict allows renaming as in ``.agg()``.
1033+
1034+ .. ipython:: python
1035+
1036+    tsdf.A.transform({'foo': np.abs})
1037+    tsdf.A.transform({'foo': np.abs, 'bar': [lambda x: x+1, 'sqrt']})
1038+
1039+
1040+ .. _basics.elementwise:
1041+
1042+ Applying Elementwise Functions
1043+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
840 1044
841 1045 Since not all functions can be vectorized (accept NumPy arrays and return
842 1046 another array or value), the methods :meth:`~DataFrame.applymap` on DataFrame