python - assign a column in a data frame based on another column
I have the following example dataframe:
univ  date        ms  kv
a     11/01/2007  1   0.2
a     11/02/2007  0   0.3
a     11/03/2007  1   0.4
a     11/05/2007  1   0.1
b     11/01/2007  0   0.11
b     11/03/2007  1   0.12
b     11/04/2007  1   0.13
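For each univ group, I want to calculate the average of kv on the next available date after each row where ms = 1. In the case of a above, ms = 1 on 11/01, 11/03 and 11/05, so the next available dates are 11/02 and 11/05 (there is no date after 11/05), and the output should be:

univ  kv
a     0.2   (average of 0.3 and 0.1)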
I would also like to make "next available date" flexible, e.g. "the second next or third next available date".
thanks much!
iiuc:
In [244]: n = 1

In [245]: df.groupby('univ') \
     ...:   .apply(lambda x: x.loc[x.ms.shift(n)==1, 'kv'].mean()) \
     ...:   .reset_index(name='kv')
Out[245]:
  univ    kv
0    a  0.20
1    b  0.13

In [246]: n = 2

In [247]: df.groupby('univ') \
     ...:   .apply(lambda x: x.loc[x.ms.shift(n)==1, 'kv'].mean()) \
     ...:   .reset_index(name='kv')
Out[247]:
  univ   kv
0    a  0.4
1    b  NaN
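For anyone who wants to reproduce this, here is a self-contained sketch of the same shift idea. The function name mean_kv_after_ms is made up for illustration, the dataframe is rebuilt by hand from the table in the question, and the sort step assumes that "next available date" means the next row of the same univ in date order. It uses a grouped shift plus a mask rather than apply, which is a variation on the answer above and not the answer's own code.

```python
import pandas as pd

# Example data rebuilt by hand from the table in the question.
df = pd.DataFrame({
    "univ": ["a", "a", "a", "a", "b", "b", "b"],
    "date": pd.to_datetime(
        ["11/01/2007", "11/02/2007", "11/03/2007", "11/05/2007",
         "11/01/2007", "11/03/2007", "11/04/2007"],
        format="%m/%d/%Y",
    ),
    "ms": [1, 0, 1, 1, 0, 1, 1],
    "kv": [0.2, 0.3, 0.4, 0.1, 0.11, 0.12, 0.13],
})


def mean_kv_after_ms(frame, n=1):
    """Hypothetical helper: per univ, average kv on the n-th next row after ms == 1."""
    # Assumption: rows must be in date order within each univ for shift to mean "next date".
    ordered = frame.sort_values(["univ", "date"])
    # True where the row n positions earlier in the same univ has ms == 1.
    hit = ordered.groupby("univ")["ms"].shift(n).eq(1)
    # Blank out kv on non-matching rows, then average per univ (NaN values are skipped).
    return (
        ordered["kv"].where(hit)
        .groupby(ordered["univ"]).mean()
        .reset_index(name="kv")
    )


print(mean_kv_after_ms(df, n=1))
print(mean_kv_after_ms(df, n=2))
```

A univ with no qualifying row (b with n=2 here) still shows up in the result with NaN, which matches the output in the answer.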