R - How can I remove all cells with "NA" values by column?
This question is not a duplicate, because my data.frame does not have the same number of NA values in each column, so the solution mentioned in the other question does not work.
I have a data.frame with a lot of NA values, and I want to delete the cells (important: not rows or columns, cells) that have NA values. The original looks like this:
    a  b
    1  NA
    NA 2
    2  NA
    NA NA
    NA NA
    NA 4
    3  5
The desired result is this:
    a b
    1 2
    2 4
    3 5
The number of columns has to stay the same; it does not matter whether the values remain on the same rows. They can be moved up.
I imagine one could delete the cells that meet the condition NA (maybe with apply) and get the result. Or maybe a simple sorting would do?
Thanks.
Update:
     a  b  c
     1 NA  3
    NA  2 NA
     4 NA  3
    NA NA NA
    NA NA  2
    NA NA NA
     3 NA  5
    NA NA  4
    NA  9 NA
     7 NA  1
The OP has requested to remove NAs columnwise and has pointed out that there might be different numbers of NAs in each column.
This can be solved using data.table in 2 steps:
    library(data.table)
    # Step 1: convert to a data.table, move the NAs to the bottom of each column,
    # and maintain the original order of the non-NA values
    result <- data.table(df)[, lapply(.SD, function(x) x[order(is.na(x))])]
         a  b  c
     1:  1  2  3
     2:  4  1  3
     3:  3  9  2
     4:  7 NA  5
     5: NA NA  4
     6: NA NA  1
     7: NA NA NA
     8: NA NA NA
     9: NA NA NA
    10: NA NA NA
    # Step 2: trim the result
    # either using Reduce
    result[!result[, Reduce(`&`, lapply(.SD, is.na))]]
    # or using zoo::na.trim()
    zoo::na.trim(result, is.na = "all")
        a  b c
    1:  1  2 3
    2:  4  1 3
    3:  3  9 2
    4:  7 NA 5
    5: NA NA 4
    6: NA NA 1
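For readers without data.table or zoo, the same two steps can be sketched in base R. This is only an illustration, not part of the original answer: the names `shifted`, `all_na` and `trimmed` are made up here, and the data is rebuilt from the "Data" section so the snippet runs on its own.

```r
# Data as given in the "Data" section (rebuilt so the sketch is self-contained).
df <- data.frame(
  a = c(1L, NA, 4L, NA, NA, NA, 3L, NA, NA, 7L),
  b = c(NA, 2L, NA, NA, 1L, NA, NA, NA, 9L, NA),
  c = c(3L, NA, 3L, NA, 2L, NA, 5L, 4L, NA, 1L)
)

# Step 1: within each column, move the NAs to the bottom.
# order() leaves ties in their original order, so the non-NA values keep their sequence.
shifted <- as.data.frame(lapply(df, function(x) x[order(is.na(x))]))

# Step 2: drop the rows in which every column is NA.
all_na <- rowSums(is.na(shifted)) == ncol(shifted)
trimmed <- shifted[!all_na, , drop = FALSE]
print(trimmed)
```

Because the NAs were pushed to the bottom in step 1, the all-NA rows can only occur at the end, so dropping them is equivalent to trimming.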
So, there will unavoidably be some NAs at the end of some columns, because all columns in a data.frame have the same length.
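A quick way to see why the padding is unavoidable is to count the non-NA values per column. The sketch below is an illustration only; `non_na` is a name chosen here, and the data is rebuilt from the "Data" section.

```r
# Data as given in the "Data" section (rebuilt so the sketch is self-contained).
df <- data.frame(
  a = c(1L, NA, 4L, NA, NA, NA, 3L, NA, NA, 7L),
  b = c(NA, 2L, NA, NA, 1L, NA, NA, NA, 9L, NA),
  c = c(3L, NA, 3L, NA, 2L, NA, 5L, 4L, NA, 1L)
)

# Strip the NAs from each column; the result is a list, not a data.frame,
# because the remaining vectors differ in length.
non_na <- lapply(df, function(x) x[!is.na(x)])

# Number of remaining values per column.
print(lengths(non_na))
```

The columns keep 4, 3 and 6 values respectively, so a rectangular result needs at least 6 rows, with NAs filling the shorter columns.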
Or, alternatively, only complete rows can be kept by using the is.na parameter of na.trim():
zoo::na.trim(result, is.na = "any")
       a b c
    1: 1 2 3
    2: 4 1 3
    3: 3 9 2
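The "complete rows only" variant can also be sketched in base R with complete.cases(). Again, this is an illustrative alternative rather than the method used above; `shifted` and `complete_rows` are names chosen for this sketch.

```r
# Data as given in the "Data" section (rebuilt so the sketch is self-contained).
df <- data.frame(
  a = c(1L, NA, 4L, NA, NA, NA, 3L, NA, NA, 7L),
  b = c(NA, 2L, NA, NA, 1L, NA, NA, NA, 9L, NA),
  c = c(3L, NA, 3L, NA, 2L, NA, 5L, 4L, NA, 1L)
)

# Move the NAs to the bottom of each column, keeping the order of the other values.
shifted <- as.data.frame(lapply(df, function(x) x[order(is.na(x))]))

# Keep only the rows that contain no NA at all.
complete_rows <- shifted[complete.cases(shifted), , drop = FALSE]
print(complete_rows)
```

Note that this discards values from the longer columns: only as many rows survive as the shortest column has non-NA values.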
An alternative solution
As mentioned before, data.frames as well as cbind() expect all column vectors to have the same length. Here is an alternative solution without data.table that uses the cbind.fill() function from the rowr package, which pads the vectors with the fill value until they have the same length:
    setNames(do.call(function(...) rowr::cbind.fill(..., fill = NA), lapply(df, na.omit)), colnames(df))
       a  b c
    1  1  2 3
    2  4  1 3
    3  3  9 2
    4  7 NA 5
    5 NA NA 4
    6 NA NA 1
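If the rowr package is not available, the padding can be approximated in base R by extending each vector with `length<-`, which fills the new positions with NA. This is a sketch under that assumption and not the cbind.fill() implementation; `non_na`, `n` and `padded` are names invented for the example.

```r
# Data as given in the "Data" section (rebuilt so the sketch is self-contained).
df <- data.frame(
  a = c(1L, NA, 4L, NA, NA, NA, 3L, NA, NA, 7L),
  b = c(NA, 2L, NA, NA, 1L, NA, NA, NA, 9L, NA),
  c = c(3L, NA, 3L, NA, 2L, NA, 5L, 4L, NA, 1L)
)

# Drop the NAs from each column; the vectors now differ in length.
non_na <- lapply(df, function(x) x[!is.na(x)])

# Length of the longest remaining column.
n <- max(lengths(non_na))

# Extend every vector to length n; `length<-` pads with NA.
padded <- as.data.frame(lapply(non_na, function(x) {
  length(x) <- n
  x
}))
print(padded)
```

The column names carry over from the list, so no extra renaming step is needed here.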
Data
As supplied by the OP in the update:
    df <- structure(list(a = c(1L, NA, 4L, NA, NA, NA, 3L, NA, NA, 7L),
                         b = c(NA, 2L, NA, NA, 1L, NA, NA, NA, 9L, NA),
                         c = c(3L, NA, 3L, NA, 2L, NA, 5L, 4L, NA, 1L)),
                    .Names = c("a", "b", "c"), row.names = c(NA, -10L), class = "data.frame")