python - How to use nlargest on multilevel pivot_table in pandas? -


i'm summing values in pivot table using pandas.

dfr = pd.dataframe({'a':  [1,1,1,1,2,2,2,2],                      'b':  [1,2,2,3,1,2,2,2],                      'c':  [1,1,1,2,1,1,2,2],                      'val':[1,1,1,1,1,1,1,1]}) dfr = dfr.pivot_table(values='val', index=['a', 'b', 'c'], aggfunc=np.sum) dfr 

output:

a   b   c   |val ------------|--- 1   1   1   |1     2   1   |2     3   2   |1 2   1   1   |1     2   1   |1         2   |2 

the way need output show largest in each group a, this:

a   b   c   |val ------------|--- 1   2   1   |2 2   2   2   |2 

i've googled bit around , tried using nlargest() in different ways without being able produce result want. got ideas?

i think need groupby + nlargest level a:

dfr = dfr.pivot_table(values='val', index=['a', 'b', 'c'], aggfunc=np.sum) dfr = dfr.groupby(level='a')['val'].nlargest(1).reset_index(level=0, drop=true).reset_index() print (dfr)     b  c  val 0  1  2  1    2 1  2  2  2    2 

because if use pivot_table levels lost:

dfr = dfr.pivot_table(values='val', index=['a', 'b', 'c'], aggfunc=np.sum).reset_index() dfr = dfr.pivot_table(values='val', index='a', aggfunc=lambda x: x.nlargest(1)) print (dfr)    val      1    2 2    2 

and if use levels return nlrgest levels (not want)

dfr = dfr.pivot_table(values='val', index=['a', 'b', 'c'], aggfunc=np.sum).reset_index() dfr = dfr.pivot_table(values='val', index=['a', 'b', 'c'], aggfunc=lambda x: x.nlargest(1)) print (dfr)        val b c      1 1 1    1   2 1    2   3 2    1 2 1 1    1   2 1    1     2    2 

Comments

Popular posts from this blog

php - Vagrant up error - Uncaught Reflection Exception: Class DOMDocument does not exist -

vue.js - Create hooks for automated testing -

Add new key value to json node in java -