How to perform a max_by window aggregation in Polars?
I am trying to use Polars to compute a window aggregate over one column, but return the corresponding value from another column.
For example, I want to get the name associated with the max value in each group, instead of (or in addition to) just the max value.
Assuming an input like this:
import polars as pl

df = pl.from_repr("""
┌───────┬──────┬───────┐
│ label ┆ name ┆ value │
│ --- ┆ --- ┆ --- │
│ str ┆ str ┆ f64 │
╞═══════╪══════╪═══════╡
│ a. ┆ foo ┆ 1.0 │
│ a. ┆ bar ┆ 2.0 │
│ b. ┆ baz ┆ 1.5 │
│ b. ┆ boo ┆ -1.0 │
└───────┴──────┴───────┘
""")
# 'max_by' is not a real method; I'm just using it to express what I'm trying to achieve.
df.select(pl.col('label'), pl.col('name').max_by('value').over('label'))
I want an output like this:
shape: (2, 2)
┌───────┬──────┐
│ label ┆ name │
│ --- ┆ --- │
│ str ┆ str │
╞═══════╪══════╡
│ a. ┆ bar │
│ b. ┆ baz │
└───────┴──────┘
Ideally the output would include the value too, though I know I can easily add that with pl.col('value').max().over('label'):
shape: (2, 3)
┌───────┬──────┬───────┐
│ label ┆ name ┆ value │
│ --- ┆ --- ┆ --- │
│ str ┆ str ┆ f64 │
╞═══════╪══════╪═══════╡
│ a. ┆ bar ┆ 2.0 │
│ b. ┆ baz ┆ 1.5 │
└───────┴──────┴───────┘