DOC: Restructure and expand UDF page #61470

Open
wants to merge 4 commits into
base: main
from

Member

@datapythonista datapythonista commented May 21, 2025

I changed the order in which the methods are presented, both in the table and in the sections, to be:

  • map
  • apply
  • pipe
  • filter
  • agg
  • transform

I find it easier to explain them in this order.
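
As a rough illustration of that order, the sketch below uses one small user-defined function with each of the six methods. The `temperature` data and all helper functions are hypothetical; `is_long_name` borrows its name from the diff in this PR, but its body here is an assumption. `Series.map` is used for the first step so the sketch does not depend on a recent pandas version.

```python
import pandas as pd

# Hypothetical data, not taken from the PR.
temperature = pd.DataFrame({"NYC": [14, 21, 23], "Los Angeles": [22, 28, 31]})


def to_fahrenheit(celsius):
    """Convert a single Celsius value to Fahrenheit."""
    return celsius * 9 / 5 + 32


def add_row_mean(df):
    """Return a copy of the frame with a per-row mean column."""
    return df.assign(row_mean=df.mean(axis=1))


def is_long_name(name):
    """Assumed rule: a column name is 'long' above five characters."""
    return len(name) > 5


# map: the function receives one element at a time.
mapped = temperature["NYC"].map(to_fahrenheit)

# apply: the function receives each column as a Series (axis=0 by default).
ranges = temperature.apply(lambda col: col.max() - col.min())

# pipe: the function receives the whole DataFrame.
piped = temperature.pipe(add_row_mean)

# filter: selects by label and takes no function, so the UDF runs in a
# list comprehension that builds the `items` argument.
filtered = temperature.filter(
    items=[col for col in temperature.columns if is_long_name(col)]
)

# agg: the function reduces each column to a single value.
peaks = temperature.agg(lambda col: col.max())

# transform: the function returns a result with the same shape as its input.
shifted = temperature.transform(lambda col: col - col.min())
```

The comments note what each method passes to the function, which is the main thing that distinguishes them.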

I also expanded the method sections with examples and a bit more information.

I removed the most complex example in the intro, as I think the examples in the sections will now do a better job of explaining the most complex cases.

@arthurlw @rhshadrach do you mind having a look?

@datapythonista datapythonista added Docs Apply Apply, Aggregate, Transform, Map labels May 21, 2025
@@ -118,101 +104,229 @@ decisions, ensuring more efficient and maintainable code.
and :ref:`ewm()<window>` for details.


:meth:`DataFrame.apply`
~~~~~~~~~~~~~~~~~~~~~~~
.. _udf.map:
Member

If we plan to use udf as the reference prefix, then we should rename the reference at the top of the file from:

.. _user_defined_functions:

to:

.. _udf:

df_filtered = df.filter(items=[col for col in df.columns if is_long_name(col)])
print(df_filtered)
temperature.apply(highest_jump)
Member

Suggested change
- temperature.apply(highest_jump)
+ temperature.agg(highest_jump)
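
To make the suggestion concrete, here is a minimal sketch that assumes `highest_jump` reduces each column to a single number. The function body and the `temperature` values are hypothetical, since the diff context above shows neither.

```python
import pandas as pd

# Hypothetical data; the real `temperature` object is not shown in the diff.
temperature = pd.DataFrame({"NYC": [14, 21, 23], "Los Angeles": [22, 28, 31]})


def highest_jump(column):
    """Assumed body: largest absolute change between consecutive readings."""
    return column.diff().abs().max()


# agg: one value per column, which matches what a reducing function returns.
via_agg = temperature.agg(highest_jump)

# apply: yields the same values here, but does not signal a reduction.
via_apply = temperature.apply(highest_jump)
```

With a reducing function like this one, both calls return one value per column; `agg` simply makes the intent to aggregate explicit.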

@arthurlw
Member

Looks good to me! I think the example under vectorized operations should be changed to fit with the Fahrenheit example, but that can be added in a follow-up PR.

@datapythonista
Member Author

Thanks @arthurlw, great feedback. I'll leave the example in the vectorized section for now, as it may make sense to also expand that section as we make progress with the JIT engines. Feel free to update it now if you want, but I'm unsure at this point how to add the JIT engines to that section, and how best to present all the performance-related topics. Maybe we can just add a section for it, but maybe we can find a way to present it so that each topic expands on the previous one, as I tried to do with the different methods.

Labels
Apply Apply, Aggregate, Transform, Map Docs
2 participants