Counting the observations through each path of a data.tree in R


Using data.tree to build a custom hierarchy, I'm looking to count the number of observations that run through each node.

    library(MASS)
    library(data.tree)

    data(Cars93)

    cars93 <- subset(Cars93, Manufacturer %in% c("Acura", "Toyota"))[
        , c("Manufacturer", "DriveTrain", "Passengers")]

    > cars93
       Manufacturer DriveTrain Passengers
    1         Acura      Front          5
    2         Acura      Front          5
    84       Toyota      Front          5
    85       Toyota      Front          4
    86       Toyota      Front          5
    87       Toyota        4WD          7

The current output adds the children correctly at the first sub-node, but it skips the "DriveTrain" column at the "Acura" level, and at the "Toyota" level it stops adding "Passengers" children after the first iteration.

      levelName       obs.ct
    1 cars               6
    2  ¦--Acura          2
    3  ¦   °--5          2
    4  °--Toyota         4
    5      ¦--4WD        1
    6      ¦   °--7      1
    7      °--Front      3

All of the built-in counting functions appear to apply at the node and leaf levels, not at the observation level, unless I'm missing something there. Building the tree from the data frame one node at a time and counting the rows is the only solution I've come across.
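As a cross-check on what the per-node counts should be, the row counts can also be computed without a tree by counting rows per path prefix. This is only a sketch in base R: `path_counts` is a hypothetical helper name, and the six rows are typed in by hand to mirror the subset shown above rather than loaded from MASS.

```r
# Same six rows as the Cars93 subset in the question, entered by hand.
cars <- data.frame(
  Manufacturer = c("Acura", "Acura", "Toyota", "Toyota", "Toyota", "Toyota"),
  DriveTrain   = c("Front", "Front", "Front", "Front", "Front", "4WD"),
  Passengers   = c(5, 5, 5, 4, 5, 7),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for every prefix of the column list, paste the values
# into a path string and count how many rows share each path.
path_counts <- function(data, root = "cars") {
  counts <- setNames(nrow(data), root)
  for (k in seq_along(data)) {
    paths  <- do.call(paste, c(list(root), data[seq_len(k)], sep = "/"))
    counts <- c(counts, vapply(split(paths, paths), length, integer(1)))
  }
  counts
}

counts <- path_counts(cars)
counts[["cars/Toyota/Front"]]  # rows that pass through Toyota -> Front: 3
```

The result is a named integer vector keyed by path, so it gives the target numbers for every node, including the "Acura/Front" and "Toyota/Front/..." levels that are missing from the current tree output.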

I've come close by updating the training code from https://cran.r-project.org/web/packages/data.tree/vignettes/applications.html#id3-introduction, but it breaks somewhere between splitting on each feature and recursively calling the function on each child. I've also tried sapply'ing the split over all of the features at once, but that results in children being added at the wrong levels of the hierarchy. This is the closest I've been able to get to aligning the output.

    isPure <- function(data) {
        length(unique(data[, ncol(data)])) == 1
    }

    path_func <- function(node, data) {
        node$obs.ct <- nrow(data)

        if (isPure(data)) {
            child <- node$AddChild(unique(data[, ncol(data)]))
            child$obs.ct <- nrow(data)
        } else {
            childObs <- split(data[, 2:ncol(data), drop = FALSE], data[, 1], drop = TRUE)

            for (i in 1:length(childObs)) {
                child <- node$AddChild(names(childObs)[i])
                path_func(child, childObs[[i]])
            }
        }
    }

    tree <- Node$new("cars")
    path_func(tree, cars93)
    print(tree, "obs.ct")
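Two things in the attempt above appear to explain the output. The purity check fires as soon as the last column has a single value, so under "Acura" (where every row has 5 passengers) the leaf is added directly and "DriveTrain" is never split on. And once only one column is left, `2:ncol(data)` counts backwards (`2:1`), so the column selection fails under "Toyota"/"Front". Below is a minimal sketch of one way around both, assuming every column should become a tree level regardless of purity. `count_paths` is a hypothetical name, and the data is typed in by hand to mirror the subset in the question.

```r
library(data.tree)

# Same six rows as the Cars93 subset in the question, entered by hand.
cars <- data.frame(
  Manufacturer = c("Acura", "Acura", "Toyota", "Toyota", "Toyota", "Toyota"),
  DriveTrain   = c("Front", "Front", "Front", "Front", "Front", "4WD"),
  Passengers   = c(5, 5, 5, 4, 5, 7),
  stringsAsFactors = FALSE
)

# Hypothetical helper: store the row count on `node`, then split the rows on
# the next column in `cols` and recurse until no columns remain.
count_paths <- function(node, data, cols = names(data)) {
  node$obs.ct <- nrow(data)
  if (length(cols) == 0) return(invisible(node))

  groups <- split(data, data[[cols[1]]], drop = TRUE)
  for (value in names(groups)) {
    child <- node$AddChild(value)
    count_paths(child, groups[[value]], cols[-1])
  }
  invisible(node)
}

tree <- Node$new("cars")
count_paths(tree, cars)
print(tree, "obs.ct")
```

Passing the remaining column names down the recursion, rather than slicing columns off the data frame, avoids the `2:1` edge case, and dropping the purity check means a level is added for every column even when all of its rows agree.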

