r - Filtering multiple CSV files while importing into a data frame
I have a large number of CSV files that I want to read into R. The column headings in the CSVs are all the same. I want to import into a data frame only those rows from each file where a given variable is within a given range (above a minimum threshold and below a maximum threshold), e.g.
```
  v1 v2 v3
1  x  q  2
2  c  w  4
3  v  e  5
4  b  r  7
```
Filtering on v3 (v3 > 2 & v3 < 7) should result in:
```
  v1 v2 v3
1  c  w  4
2  v  e  5
```
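For reference, the desired filter on a single in-memory data frame in base R looks like this (a minimal sketch; `minval` and `maxval` are placeholder names introduced here):

```r
# Toy data matching the example above
df <- data.frame(v1 = c("x", "c", "v", "b"),
                 v2 = c("q", "w", "e", "r"),
                 v3 = c(2, 4, 5, 7))

minval <- 2  # lower threshold (exclusive)
maxval <- 7  # upper threshold (exclusive)

# Keep only rows where v3 lies strictly between the thresholds
filtered <- df[df$v3 > minval & df$v3 < maxval, ]
filtered
#   v1 v2 v3
# 2  c  w  4
# 3  v  e  5
```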
So far I import the data from all the CSVs into a single data frame and then do the filtering:
```r
# read the data files
filenames <- list.files(path = workdir)
mergedfiles <- do.call("rbind", sapply(filenames, read.csv, simplify = FALSE))
fileid <- row.names(mergedfiles)
fileid <- gsub(".csv.*", "", fileid)

# combine the data with the file ids
combfiles <- cbind(fileid, mergedfiles)

# filter the data according to the criteria
resultfile <- combfiles[combfiles$v3 > min & combfiles$v3 < max, ]
```
I would rather apply the filter while importing each single CSV file into the data frame. I assume a loop is the best way of doing it, but I am not sure how. I would appreciate any suggestion.
EDIT
After testing the suggestion from mnel, which worked, I ended up with a different solution:
```r
filenames <- list.files(path = workdir)
mzlist <- list()
for (i in 1:length(filenames)) {
  tempdata <- read.csv(filenames[i])
  mz.idx <- which(tempdata[, 1] > minmz & tempdata[, 1] < maxmz)
  mz1 <- tempdata[mz.idx, ]
  mzlist[[i]] <- data.frame(mz1, filename = rep(filenames[i], length(mz.idx)))
}
resultfile <- do.call("rbind", mzlist)
```
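The same filter-on-import idea can also be written without an explicit `for` loop, using `lapply` and combining once at the end. This is a sketch only: the column names (`mz`, `intensity`), the helper `read_filtered`, and the toy files written to a temporary directory are assumptions introduced here to keep the example self-contained; `workdir`, `minmz` and `maxmz` mirror the names used above.

```r
# Toy files so the sketch runs on its own (names are hypothetical)
workdir <- tempdir()
write.csv(data.frame(mz = c(1, 3, 5), intensity = c(10, 20, 30)),
          file.path(workdir, "a.csv"), row.names = FALSE)
write.csv(data.frame(mz = c(2, 4, 8), intensity = c(40, 50, 60)),
          file.path(workdir, "b.csv"), row.names = FALSE)
minmz <- 2
maxmz <- 6

# Read one CSV and keep only rows whose first column lies in (lo, hi)
read_filtered <- function(f, lo, hi) {
  tempdata <- read.csv(f)
  keep <- tempdata[, 1] > lo & tempdata[, 1] < hi
  data.frame(tempdata[keep, , drop = FALSE], filename = rep(f, sum(keep)))
}

filenames <- list.files(path = workdir, pattern = "\\.csv$", full.names = TRUE)
resultfile <- do.call("rbind", lapply(filenames, read_filtered,
                                      lo = minmz, hi = maxmz))
```

Each file is filtered as it is read, so only the matching rows are ever held in the combined data frame.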
Thanks for the suggestions!
Here is an approach using data.table, which allows you to use fread (which is faster than read.csv) and rbindlist, a superfast implementation of do.call(rbind, list(...)) — perfect for this situation. data.table also has the function between.
```r
library(data.table)

filenames <- list.files(path = workdir)
alldata <- rbindlist(lapply(filenames, function(x, min, max) {
  xx <- fread(x, sep = ",")
  xx[, fileid := gsub(".csv.*", "", x)]
  xx[between(v3, lower = min, upper = max, incbounds = FALSE)]
}, min = 2, max = 3))
```
If the individual files are large and v3 contains integer values, it might be worth setting v3 as the key and using a binary search — it may be quicker to import everything and then run the filtering that way.
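The keyed variant mentioned above can be sketched on toy data as follows; with v3 set as the key, subsetting via a join (`J(...)`) is a binary search on the sorted key rather than a full vector scan (the data values here are just the example from the question):

```r
library(data.table)

# Toy data matching the question's example
dt <- data.table(v1 = c("x", "c", "v", "b"),
                 v2 = c("q", "w", "e", "r"),
                 v3 = c(2L, 4L, 5L, 7L))

# Sort by v3 and mark it as the key, enabling binary search on it
setkey(dt, v3)

# Rows with v3 strictly between 2 and 7, i.e. key values 3..6;
# nomatch = 0L drops key values with no matching row
res <- dt[J(3:6), nomatch = 0L]
```

Note this strict-inequality-as-join trick only works cleanly for integer keys, which is why the suggestion is conditioned on v3 being integer-valued.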