R - Take a random sample based on the groups of another dataset, with unmatched cases
I have these two datasets:
    df <- data.frame(id = 1:20,
                     sex = rep(x = c(0,1), each = 10),
                     age = c(25,56,29,42,33,33,33,25,25,25,26,57,30,43,34,34,34,26,26,26),
                     ov = letters[1:20])
    df1 <- data.frame(sex = c(0,0,0,1,1), age = c(25,33,39,41,43))
I want to take 1 random row from every (sex, age) group of df, according to every group in df1. Not all ages in df1 have a match in df. For every group in df1 with no match, I want to impute the value of the variable ov from the row in df with the same sex and the closest age, like this:
    df3 <- rbind(df[c(8,7),2:4], c(0,39,"d"), c(1,41,"n"), df[14,2:4])
Note that the donor case for sex = 0, age = 39 is df[4,], and the donor case for sex = 1, age = 41 is df[14,].
How can I do this?
Using data.table, you can try this:
1) Convert the data to data.table and add keys:
    library(data.table)

    # for df1:
    dt1 <- as.data.table(df1)  # convert to data.table
    dt1[, newsex := sex]       # this will serve as a grouping column
    dt1[, newage := age]       # this one as well
    setkey(dt1, sex, age)      # set the data.table keys
    dt1
       sex age newsex newage
    1:   0  25      0     25
    2:   0  33      0     33
    3:   0  39      0     39
    4:   1  41      1     41
    5:   1  43      1     43

    # similarly for df:
    dt <- as.data.table(df)
    setkey(dt, sex, age)
    dt
        id sex age ov
     1:  1   0  25  a
     2:  8   0  25  h
     3:  9   0  25  i
     4: 10   0  25  j
     5:  3   0  29  c
     6:  5   0  33  e
     7:  6   0  33  f
     8:  7   0  33  g
     9:  4   0  42  d
    10:  2   0  56  b
    11: 11   1  26  k
    12: 18   1  26  r
    13: 19   1  26  s
    14: 20   1  26  t
    15: 13   1  30  m
    16: 15   1  34  o
    17: 16   1  34  p
    18: 17   1  34  q
    19: 14   1  43  n
    20: 12   1  57  l
2) Use a rolling merge to build dtnew, which carries the new groups:
    dtnew <- dt1[dt, roll = "nearest"]
    dtnew
        sex age newsex newage id ov
     1:   0  25      0     25  1  a
     2:   0  25      0     25  8  h
     3:   0  25      0     25  9  i
     4:   0  25      0     25 10  j
     5:   0  29      0     25  3  c
     6:   0  33      0     33  5  e
     7:   0  33      0     33  6  f
     8:   0  33      0     33  7  g
     9:   0  42      0     39  4  d
    10:   0  56      0     39  2  b
    11:   1  26      1     41 11  k
    12:   1  26      1     41 18  r
    13:   1  26      1     41 19  s
    14:   1  26      1     41 20  t
    15:   1  30      1     41 13  m
    16:   1  34      1     41 15  o
    17:   1  34      1     41 16  p
    18:   1  34      1     41 17  q
    19:   1  43      1     43 14  n
    20:   1  57      1     43 12  l
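To make the grouping produced by the rolling merge easier to follow, here is a minimal base R sketch of the same idea without data.table: each row of df is assigned to the df1 age of the same sex that is closest to its own age. The helper name nearest_group_age is hypothetical, and ties are resolved by which.min (the lower age wins), which is an assumption of this sketch and not a statement about how data.table breaks ties.

```r
# Data from the question
df <- data.frame(id = 1:20,
                 sex = rep(x = c(0, 1), each = 10),
                 age = c(25, 56, 29, 42, 33, 33, 33, 25, 25, 25,
                         26, 57, 30, 43, 34, 34, 34, 26, 26, 26),
                 ov = letters[1:20])
df1 <- data.frame(sex = c(0, 0, 0, 1, 1), age = c(25, 33, 39, 41, 43))

# Hypothetical helper: for one (sex, age) pair from df, return the age of
# the df1 group with the same sex that is closest to that age.
# Ties go to the first (lowest) candidate because of which.min.
nearest_group_age <- function(sex, age, groups) {
  candidates <- groups$age[groups$sex == sex]
  candidates[which.min(abs(candidates - age))]
}

# Assign every row of df to its nearest df1 group
df$newage <- mapply(nearest_group_age, df$sex, df$age,
                    MoreArgs = list(groups = df1))

# Each distinct (sex, age) in df and the df1 group it was assigned to
unique(df[order(df$sex, df$age), c("sex", "age", "newage")])
```

For this data the assignment agrees with the newage column shown in dtnew, including age 29, which is equally far from 25 and 33.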
3) Now you can sample. In this case you can put the rows in random order and then take the first row of each group:
    dtnew <- dtnew[sample(.N)]  # create a random order
    sampledt <- unique(dtnew, by = c("newsex", "newage"))  # take the first row for each unique newsex and newage
    sampledt
       sex age newsex newage id ov
    1:   0  56      0     39  2  b
    2:   0  29      0     25  3  c
    3:   1  43      1     43 14  n
    4:   1  34      1     41 16  p
    5:   0  33      0     33  7  g
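Note that the rolling merge assigns every row of df to its nearest df1 group, so a row with age 29 can be drawn for the (0, 25) group, as happens in the sample output. If the goal is to draw only from rows with the closest available age, as in the df3 example from the question, one possible base R sketch is the following. The helper name sample_donor is hypothetical, and this is an alternative to the rolling merge, not part of the answer's data.table approach.

```r
# Data from the question
df <- data.frame(id = 1:20,
                 sex = rep(x = c(0, 1), each = 10),
                 age = c(25, 56, 29, 42, 33, 33, 33, 25, 25, 25,
                         26, 57, 30, 43, 34, 34, 34, 26, 26, 26),
                 ov = letters[1:20])
df1 <- data.frame(sex = c(0, 0, 0, 1, 1), age = c(25, 33, 39, 41, 43))

# Hypothetical helper: pick one random row of `donors` with the given sex
# whose age is as close as possible to the requested age.
# An exact age match has distance 0, so it is always preferred.
sample_donor <- function(sex, age, donors) {
  pool <- donors[donors$sex == sex, ]
  dist <- abs(pool$age - age)
  closest <- pool[dist == min(dist), ]
  closest[sample(nrow(closest), 1), ]
}

set.seed(1)  # only to make the random draw reproducible

# One donor row per df1 group
picked <- do.call(rbind, lapply(seq_len(nrow(df1)), function(i) {
  sample_donor(df1$sex[i], df1$age[i], df)
}))

# Keep the df1 group values and attach the donor's id and ov
result <- data.frame(sex = df1$sex, age = df1$age,
                     donor_id = picked$id, ov = picked$ov)
result
```

With this data, the groups (0, 39) and (1, 41) have a single closest donor each, df[4,] and df[14,], which matches the donor cases named in the question. The groups with exact matches get a random row among those matches.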