pandas - Remove duplicate indices in a MultiIndex regardless of order
Take a simple pd.Series with a MultiIndex:

    import numpy as np
    import pandas as pd

    # create the MultiIndex and the data
    mult = pd.MultiIndex.from_product([[1,2,3],[1,2,3]], names=['factor1','factor2'])
    data = np.arange(1,4)*np.arange(1,4)[:,np.newaxis]

    # create the series
    ser = (pd.Series(data.ravel(), index=mult, name='product')
           .sort_values(ascending=False))
    print(ser)

    factor1  factor2
    3        3          9
             2          6
    2        3          6
             2          4
    3        1          3
    1        3          3
    2        1          2
    1        2          2
             1          1
    Name: product, dtype: int64

How can duplicate index entries, regardless of order, be removed so that the final series is:

    factor1  factor2
    3        3          9
             2          6
    2        2          4
    3        1          3
    2        1          2
    1        1          1
    Name: product, dtype: int64

The idea is that 2*3 and 3*2 have the same factors, and I want to get rid of one of them. I've tried drop_duplicates, but that eliminates duplicate products regardless of the indices (so 1*0 and 2*0 would be considered duplicates).
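To make the drop_duplicates problem concrete, here is a minimal sketch. It assumes factors 0..2 instead of 1..3, purely so that zero products such as 1*0 and 2*0 appear; the name ser0 is made up for this illustration.

```python
import numpy as np
import pandas as pd

# Assumption for illustration only: factors 0..2, so that products of zero exist.
mult = pd.MultiIndex.from_product([[0, 1, 2], [0, 1, 2]],
                                  names=['factor1', 'factor2'])
data = np.arange(3) * np.arange(3)[:, np.newaxis]
ser0 = pd.Series(data.ravel(), index=mult, name='product')

# drop_duplicates compares the values only and ignores the index,
# so a single row survives for each distinct product.
deduped = ser0.drop_duplicates()
print(deduped)
```

Only one row per distinct product is left, so (1, 0) and (2, 0) both disappear because (0, 0) already produced a 0. They are different factor pairs, so this is not the behaviour wanted here.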
This is a bit hacky:

    ser[~pd.DataFrame(np.sort(np.array(ser.index.tolist()), 1)).duplicated().values]

    factor1  factor2
    3        3          9
             2          6
    2        2          4
    3        1          3
    2        1          2
    1        1          1
    Name: product, dtype: int64
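The one-liner above can be unpacked into steps. This is only a sketch of the same idea with intermediate names (pairs, canonical, is_repeat, result) chosen here for readability; it rebuilds ser the same way as in the question.

```python
import numpy as np
import pandas as pd

mult = pd.MultiIndex.from_product([[1, 2, 3], [1, 2, 3]],
                                  names=['factor1', 'factor2'])
data = np.arange(1, 4) * np.arange(1, 4)[:, np.newaxis]
ser = (pd.Series(data.ravel(), index=mult, name='product')
       .sort_values(ascending=False))

# Step 1: turn the index tuples into a 2-D array, one row per index entry.
pairs = np.array(ser.index.tolist())

# Step 2: sort within each row, so (2, 3) and (3, 2) both become [2, 3].
canonical = np.sort(pairs, axis=1)

# Step 3: flag every row whose canonical pair has already appeared earlier.
is_repeat = pd.DataFrame(canonical).duplicated().values

# Step 4: keep only the first occurrence of each unordered pair.
result = ser[~is_repeat]
print(result)
```

Six rows remain, one per unordered pair of factors. Which member of a mirrored pair survives, for example (3, 2) or (2, 3), depends on the order in which the two appear in ser, since duplicated() keeps the first occurrence by default.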