python - Pandas - Replacing NaN by aggregate of non-null values
Suppose I have a DataFrame with NaN values:

    import pandas as pd

    l = [{'c1':-6,'c3':2}, {'c2':-6,'c3':3}, {'c1':-6.3,'c2':8,'c3':9}, {'c2':-7}]
    df1 = pd.DataFrame(l, index=['r1','r2','r3','r4'])
    print(df1)

         c1   c2   c3
    r1 -6.0  NaN  2.0
    r2  NaN -6.0  3.0
    r3 -6.3  8.0  9.0
    r4  NaN -7.0  NaN
Problem: if a row contains a NaN, that cell has to be replaced with an aggregate of the non-null values in the same row. For instance, in the first row, the value of (r1, c2) should be (-6 + 2) / 2 = -2.
Expected output:

         c1   c2   c3
    r1 -6.0 -2.0  2.0
    r2 -1.5 -6.0  3.0
    r3 -6.3  8.0  9.0
    r4 -7.0 -7.0 -7.0
Use apply with axis=1 to process the DataFrame by rows:
    df1 = df1.apply(lambda x: x.fillna(x.mean()), axis=1)
    print(df1)

         c1   c2   c3
    r1 -6.0 -2.0  2.0
    r2 -1.5 -6.0  3.0
    r3 -6.3  8.0  9.0
    r4 -7.0 -7.0 -7.0
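A minimal sketch of why the row-wise apply gives the wanted values: the mean in pandas skips NaN by default (skipna=True), so each row's mean is taken over its non-null cells only. The frame is rebuilt here so the snippet runs on its own. The names row_means and filled are chosen for illustration, and the explicit columns argument is an assumption added only to pin the column order to the tables shown in this post.

```python
import pandas as pd

l = [{'c1': -6, 'c3': 2}, {'c2': -6, 'c3': 3},
     {'c1': -6.3, 'c2': 8, 'c3': 9}, {'c2': -7}]
# columns=... is an assumption: it fixes the column order to c1, c2, c3
df1 = pd.DataFrame(l, index=['r1', 'r2', 'r3', 'r4'],
                   columns=['c1', 'c2', 'c3'])

# Row-wise mean; NaN cells are ignored because skipna defaults to True
row_means = df1.mean(axis=1)

# Each row is passed to the lambda as a Series and filled with its own mean
filled = df1.apply(lambda x: x.fillna(x.mean()), axis=1)

print(row_means)
print(filled)
```

Printing row_means first makes it easy to check a single row by hand, for example (-6 + 2) / 2 for r1, before looking at the filled frame.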
This also works:

    df1 = df1.T.fillna(df1.mean(axis=1)).T
    print(df1)

         c1   c2   c3
    r1 -6.0 -2.0  2.0
    r2 -1.5 -6.0  3.0
    r3 -6.3  8.0  9.0
    r4 -7.0 -7.0 -7.0
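The transpose is needed because fillna cannot fill from a Series along axis=1 directly:

    df1 = df1.fillna(df1.mean(axis=1), axis=1)
    print(df1)

    NotImplementedError: Currently only can fill with dict/Series column by column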
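Another way around that limitation, not part of the original answer, is DataFrame.where with axis=0, which aligns a Series of row means on the row index and so avoids the double transpose. This is only a sketch under the same assumptions as the rest of the post: the frame is rebuilt so the snippet is self-contained, the explicit columns argument just pins the column order, and row_means and filled are illustrative names.

```python
import pandas as pd

l = [{'c1': -6, 'c3': 2}, {'c2': -6, 'c3': 3},
     {'c1': -6.3, 'c2': 8, 'c3': 9}, {'c2': -7}]
# columns=... is an assumption: it fixes the column order to c1, c2, c3
df1 = pd.DataFrame(l, index=['r1', 'r2', 'r3', 'r4'],
                   columns=['c1', 'c2', 'c3'])

# One mean per row, computed over the non-null cells only
row_means = df1.mean(axis=1)

# Keep cells that are not NaN; elsewhere take the row mean.
# axis=0 tells where() to align row_means on the row labels r1..r4.
filled = df1.where(df1.notna(), row_means, axis=0)

print(filled)
```

Compared with the transpose version, this keeps the frame in its original orientation throughout, which may matter when the columns do not all share one dtype.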