ENH: Allow args to be specified for the pivot_table aggfunc #57884

Closed
1 of 3 tasks
j-kapp opened this issue Mar 18, 2024 · 6 comments · Fixed by #58893
Labels
Enhancement, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)

Comments

@j-kapp
j-kapp commented Mar 18, 2024

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

Currently the pivot_table has an aggfunc parameter which is used to do a groupby aggregation here. However, no additional arguments can be passed into that agg call. I'm specifically referring to the *args which can be specified in (df/series) groupby.agg function. It would be very useful if pivot_table could accept additional arguments for the aggfunc.
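For context, a minimal sketch of the behaviour being referred to: the groupby `agg` call forwards extra keyword arguments to a named aggregation. The column names and data here are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data: group "b" contains only a missing value.
df = pd.DataFrame(
    {
        "key": ["a", "a", "b"],
        "value": [1.0, 2.0, np.nan],
    }
)

grouped = df.groupby("key")["value"]

# Plain sum: NaN is skipped, so the all-NaN group sums to 0.0.
default_sum = grouped.agg("sum")

# Extra keyword arguments are passed through to the aggregation,
# so min_count=1 turns the all-NaN group into NaN instead of 0.0.
strict_sum = grouped.agg("sum", min_count=1)

print(default_sum)
print(strict_sum)
```

This pass-through is what `pivot_table` does not expose for its `aggfunc`.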

Feature Description

I'm not 100% sure, but I think it would be something like this:

  1. Add a **aggfunc_args or aggfunc_args: dict parameter to the pivot_table function.
  2. Do the same for the __internal_pivot_table function
  3. Change the line here, from agged = grouped.agg(aggfunc) to agged = grouped.agg(aggfunc, **aggfunc_args).
  4. Update the docs
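
The steps above can be sketched outside of pandas internals. The helper below is hypothetical (`pivot_with_kwargs` and its parameter names are not part of pandas) and only mimics the core idea of step 3: group, aggregate with forwarded keyword arguments, then unstack. It ignores everything else `pivot_table` does (margins, `fill_value`, `dropna`, multiple values).

```python
import numpy as np
import pandas as pd


def pivot_with_kwargs(data, values, index, columns, aggfunc, **aggfunc_kwargs):
    """Hypothetical stand-in for the proposal, not the pandas implementation."""
    grouped = data.groupby([index, columns])[values]
    # The one-line change from the proposal: forward the extra keyword arguments.
    agged = grouped.agg(aggfunc, **aggfunc_kwargs)
    # Move the `columns` level of the grouped index into the column axis.
    return agged.unstack(columns)


df = pd.DataFrame(
    {
        "row": ["a", "a", "b", "b"],
        "col": ["x", "x", "x", "y"],
        "val": [1.0, 2.0, np.nan, 4.0],
    }
)

result = pivot_with_kwargs(df, "val", "row", "col", "sum", min_count=1)
print(result)
```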

Alternative Solutions

The same functionality can currently be achieved by specifying a custom function as aggfunc, but using that is much slower. My use case is pretty much the same as this.
Instead of using pd.pivot_table(... , aggfunc=lambda x: x.sum(min_count=1)), I would like to be able to do pd.pivot_table(... , aggfunc=sum, min_count=1) or similar.
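
A small sketch of the workaround described above, using made-up data. It compares the built-in `"sum"` aggregation with the lambda that passes `min_count=1`; the only point is the difference in the all-NaN cell, and no timing is measured here.

```python
import numpy as np
import pandas as pd

# Illustrative data: the ("b", "x") cell contains only a missing value.
df = pd.DataFrame(
    {
        "row": ["a", "a", "b", "b"],
        "col": ["x", "x", "x", "y"],
        "val": [1.0, 2.0, np.nan, 4.0],
    }
)

# Named aggregation: the all-NaN cell is reported as 0.0.
default_table = pd.pivot_table(
    df, values="val", index="row", columns="col", aggfunc="sum"
)

# Workaround: a custom function is the only way to pass min_count=1.
lambda_table = pd.pivot_table(
    df,
    values="val",
    index="row",
    columns="col",
    aggfunc=lambda x: x.sum(min_count=1),
)

print(default_table)
print(lambda_table)
```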

Additional Context

No response

@j-kapp j-kapp added the Enhancement and Needs Triage labels Mar 18, 2024
@rhshadrach
Member

Thanks for the request - it makes sense to me to support passing through kwargs here. The only question is how:

def pivot_table(self, ..., **kwargs)
def pivot_table(self, ..., kwargs)
def pivot_table(self, ..., args, kwargs)

I do not think we should do

def pivot_table(self, *args, ..., **kwargs)

If we want to agree with the current direction of apply/agg/transform, it would be the first of the above three options. Ref: #40112
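
To make the difference between the three options concrete, here is a plain-Python sketch of how each one would look at the call site. All names (`agg_option_1`, `agg_option_2`, `agg_option_3`, `scaled_total`) are hypothetical and stand in for `pivot_table` and an aggregation function; none of this is pandas code.

```python
def scaled_total(values, factor=1, offset=0):
    """Toy aggregation: sum the values, scale the total, then shift it."""
    return sum(values) * factor + offset


# Option 1: extra keyword arguments are collected with **kwargs.
def agg_option_1(values, func, **kwargs):
    return func(values, **kwargs)


# Option 2: keyword arguments are passed as one explicit dict.
def agg_option_2(values, func, kwargs=None):
    return func(values, **(kwargs or {}))


# Option 3: positional and keyword arguments are both passed explicitly.
def agg_option_3(values, func, args=(), kwargs=None):
    return func(values, *args, **(kwargs or {}))


data = [1, 2, 3]

r1 = agg_option_1(data, scaled_total, factor=2, offset=1)
r2 = agg_option_2(data, scaled_total, kwargs={"factor": 2, "offset": 1})
r3 = agg_option_3(data, scaled_total, args=(2,), kwargs={"offset": 1})

print(r1, r2, r3)
```

One trade-off this illustrates: with option 1, a forwarded keyword cannot share a name with one of the wrapper's own parameters (here `values` or `func`), whereas the explicit dict in options 2 and 3 avoids that collision at the cost of a noisier call.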

@rhshadrach rhshadrach added the Reshaping and Needs Discussion labels and removed the Needs Triage label Apr 8, 2024
@PF2100
Contributor

PF2100 commented May 18, 2024

take

@PF2100
Contributor

PF2100 commented May 19, 2024

Hi, regarding that first option, would passing positional arguments to the aggfunc not be supported?
I must say I agree on the signature you proposed on the referenced issue ( func, args, kwargs, other_arg, sort of like the third option here) being the best overall, however, it seems like this is not the current standard for pandas.

@PF2100
Contributor

PF2100 commented May 24, 2024

take

ruimamaral added a commit to PF2100/pandas that referenced this issue May 27, 2024
Add the option of passing keyword arguments to DataFrame.pivot_table
and pivot_table's aggfunc through **kwargs.

Co-authored-by: Pedro Freitas <pedrogmfreitas@tecnico.ulisboa.pt>
@PF2100
Contributor

PF2100 commented May 27, 2024

> Hi, regarding that first option, would passing positional arguments to the aggfunc not be supported? I must say I agree on the signature you proposed on the referenced issue ( func, args, kwargs, other_arg, sort of like the third option here) being the best overall, however, it seems like this is not the current standard for pandas.

@rhshadrach Sorry to bother you, but could you share your thoughts on this matter?

@rhshadrach
Member

rhshadrach commented May 29, 2024

@PF2100 - thanks for the ping!

> I must say I agree on the signature you proposed on the referenced issue ( func, args, kwargs, other_arg, sort of like the third option here) being the best overall, however, it seems like this is not the current standard for pandas.

Same here, but for consistency, it seems best to go with (1) for now. Changing from **kwargs to kwargs is likely to be quite noisy for users and I don't feel confident it will get enough support.

> regarding that first option, would passing positional arguments to the aggfunc not be supported?

Correct.

ruimamaral added a commit to PF2100/pandas that referenced this issue Jun 2, 2024
Add the option of passing keyword arguments to DataFrame.pivot_table
and pivot_table's aggfunc through **kwargs.

Co-authored-by: Pedro Freitas <pedrogmfreitas@tecnico.ulisboa.pt>
@rhshadrach rhshadrach added this to the 3.0 milestone Jun 6, 2024
@rhshadrach rhshadrach removed the Needs Discussion label Jun 6, 2024
rhshadrach pushed a commit that referenced this issue Jun 6, 2024
…c keyword arguments #57884 (#58893)

Co-authored-by: Pedro Freitas <pedrogmfreitas@tecnico.ulisboa.pt>
Co-authored-by: Rui Amaral <rui.miguel.amaral@tecnico.ulisboa.pt>
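
A hedged sketch of what usage might look like once the change is available. It assumes a pandas version that includes this fix (the issue is milestoned for 3.0) and that keyword arguments are forwarded to `aggfunc` as the commit message describes. The helper name `pivot_sum_min_count` and the data are illustrative; on older versions, where `pivot_table` rejects unknown keyword arguments, the sketch falls back to the lambda workaround from the issue description.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "row": ["a", "a", "b", "b"],
        "col": ["x", "x", "x", "y"],
        "val": [1.0, 2.0, np.nan, 4.0],
    }
)


def pivot_sum_min_count(data):
    """Illustrative helper: sum with min_count=1, whichever pandas is installed."""
    try:
        # Assumption: this pandas forwards extra keyword arguments to aggfunc.
        return data.pivot_table(
            values="val", index="row", columns="col", aggfunc="sum", min_count=1
        )
    except TypeError:
        # Older pandas: unknown keyword arguments raise TypeError,
        # so use the custom-function workaround instead.
        return data.pivot_table(
            values="val",
            index="row",
            columns="col",
            aggfunc=lambda x: x.sum(min_count=1),
        )


table = pivot_sum_min_count(df)
print(table)
```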
3 participants