r - Create a co-occurrence matrix from a .csv
I'm trying to create a co-occurrence matrix to see which keywords are associated in a database.
The data looks like this; it's a .csv file:
    id, keywords
    1, apple;pear
    2, apple;cherry
    3, pear;cherry
    4, apple;cherry

From that, I'd like to obtain this:

           apple pear cherry
    apple      0    1      2
    pear       1    0      1
    cherry     2    1      0

The goal is to use d3.js to visualize the matrix.
I've posted this in the R tag because I've used R a bit before in classes, so I'm not a complete newbie. While looking for solutions I saw that it's possible to use Python for this, but I've never touched it in my life.
You can use the tidyr (and magrittr) packages and the table function.
    library(tidyr)
    library(magrittr)

    # the example data from the question
    df <- data.frame(id = 1:4,
                     keywords = c("apple;pear", "apple;cherry", "pear;cherry", "apple;cherry"))
    # split each keyword pair into two columns, f1 and f2
    df2 <- df %>% separate(keywords, sep = ";", into = c("f1", "f2"))

The following is to have the correct levels in the row/column names:
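    # convert both columns to factors
    df2$f1 %<>% factor()
    df2$f2 %<>% factor()
    # give both factors the same full set of levels
    df2$f1 <- factor(df2$f1, levels = unique(c(levels(df2$f1), levels(df2$f2))))
    df2$f2 <- factor(df2$f2, levels = unique(c(levels(df2$f1), levels(df2$f2))))

Then you can use table (the result is not symmetric, so use + to add its transpose):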
    > table(df2$f1, df2$f2) + table(df2$f2, df2$f1)

             apple pear cherry
      apple      0    1      2
      pear       1    0      1
      cherry     2    1      0
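
The approach above splits each row into exactly two columns, so it fits data where every row holds one pair of keywords. If some rows might hold three or more keywords, one possible alternative is a base R sketch built on an incidence matrix and crossprod. This is a swapped-in technique, not the method used above, and the names csv_text, kw_list, all_kw, incidence and cooc are made up for illustration. The CSV is inlined here as an assumption so the example runs on its own; for a real file you would pass its path to read.csv instead of text =.

```r
# Hypothetical inline copy of the CSV from the question.
csv_text <- "id,keywords
1,apple;pear
2,apple;cherry
3,pear;cherry
4,apple;cherry"

# strip.white trims stray spaces around unquoted fields.
df <- read.csv(text = csv_text, stringsAsFactors = FALSE, strip.white = TRUE)

# Split each row's keywords on ";" (any number of keywords per row is fine).
kw_list <- strsplit(df$keywords, ";", fixed = TRUE)

# All distinct keywords, in order of first appearance.
all_kw <- unique(unlist(kw_list))

# Incidence matrix: one row per id, one column per keyword, 1 if present.
incidence <- do.call(rbind, lapply(kw_list, function(k) as.integer(all_kw %in% k)))
dimnames(incidence) <- list(df$id, all_kw)

# crossprod(X) computes t(X) %*% X: entry [i, j] counts the rows that
# contain both keyword i and keyword j.
cooc <- crossprod(incidence)
diag(cooc) <- 0  # drop each keyword's co-occurrence with itself

print(cooc)
```

On the four example rows this should give the same counts as the table-based result (apple/cherry 2, apple/pear 1, pear/cherry 1). The result is a plain numeric matrix, so it can be written out with write.csv or converted to whatever format the d3.js side expects.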