pandas - Remove duplicate indices in a MultiIndex regardless of order

May 15, 2010

Take a simple pd.Series with a MultiIndex:

    import numpy as np
    import pandas as pd

    # create the MultiIndex and the data
    mult = pd.MultiIndex.from_product([[1, 2, 3], [1, 2, 3]],
                                      names=['factor1', 'factor2'])
    data = np.arange(1, 4) * np.arange(1, 4)[:, np.newaxis]

    # create the series, sorted by product in descending order
    ser = (pd.Series(data.ravel(),
                     index=mult,
                     name='product')
             .sort_values(ascending=False))

    print(ser)

    factor1  factor2
    3        3          9
             2          6
    2        3          6
             2          4
    3        1          3
    1        3          3
    2        1          2
    1        2          2
             1          1
    Name: product, dtype: int64

How can duplicate index entries, regardless of order, be removed so that the final series is:

    factor1  factor2
    3        3          9
             2          6
    2        2          4
    3        1          3
    2        1          2
    1        1          1
    Name: product, dtype: int64

The idea is that 2*3 and 3*2 have the same factors, so I want to get rid of one of them. I've tried drop_duplicates, but it eliminates duplicate products regardless of the indices (so 1*0 and 2*0 would be considered duplicates).
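To make the drop_duplicates problem concrete, here is a small sketch. It assumes a hypothetical variant of the series with factors 0 to 2 instead of 1 to 3 (the names `zero_ser`, `by_value` and `unordered_pairs` are made up for illustration), so that several different factor pairs share the product 0.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the series above: factors 0..2 instead of 1..3,
# so that several different factor pairs share the product 0.
idx = pd.MultiIndex.from_product([[0, 1, 2], [0, 1, 2]],
                                 names=['factor1', 'factor2'])
vals = np.arange(3) * np.arange(3)[:, np.newaxis]
zero_ser = pd.Series(vals.ravel(), index=idx, name='product')

# drop_duplicates compares the values only and ignores the index.
by_value = zero_ser.drop_duplicates()

# Distinct unordered factor pairs, i.e. how many rows should survive.
unordered_pairs = {tuple(sorted(t)) for t in zero_ser.index}

print(len(by_value), len(unordered_pairs))
```

Here drop_duplicates keeps only 4 rows (one per distinct product), although there are 6 distinct unordered factor pairs, so pairs such as 0*1 and 0*2 are lost even though they are not the same factors.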

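This is a bit hacky, but it works: sort each index pair, then drop the rows whose sorted pair has already been seen.

    ser[~pd.DataFrame(np.sort(np.array(ser.index.tolist()), axis=1)).duplicated().values]

    factor1  factor2
    3        3          9
             2          6
    2        2          4
    3        1          3
    2        1          2
    1        1          1
    Name: product, dtype: int64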
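The one-liner above packs several steps together, so it may help to see them unrolled. The following is only a sketch of the same approach; the intermediate names (`pairs`, `canonical`, `is_repeat`, `deduped`) are introduced here for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

mult = pd.MultiIndex.from_product([[1, 2, 3], [1, 2, 3]],
                                  names=['factor1', 'factor2'])
data = np.arange(1, 4) * np.arange(1, 4)[:, np.newaxis]
ser = (pd.Series(data.ravel(), index=mult, name='product')
         .sort_values(ascending=False))

# Each index entry is a (factor1, factor2) tuple; stack them into an (n, 2) array.
pairs = np.array(ser.index.tolist())

# Sort within each row so that (2, 3) and (3, 2) both become (2, 3).
canonical = np.sort(pairs, axis=1)

# Flag every repeat of a canonical pair after its first occurrence.
is_repeat = pd.DataFrame(canonical).duplicated().to_numpy()

# Keep only the first occurrence of each unordered pair.
deduped = ser[~is_repeat]
print(deduped)
```

Because duplicated() keeps the first occurrence, which of the two orderings survives depends on the order of the rows in ser. Rows with equal products can come out of sort_values in either order, so the surviving index labels may differ from run to run, but one row per unordered pair always remains.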
