r - Calculation of date difference (BUG?) for POSIXct columns


I am using this code to get the difference in hours between two POSIXct dates:

x <- transform(x, hrs = ceiling(as.numeric(ship_date-pick_date))) 

This gives accurate results. However, when I tried to find the hour difference for a similar column, I needed this:

x <- transform(x, hrs_adj = ceiling(as.numeric(ship_date-adj_pick_date)/60)) 

pick_date and ship_date were both extracted using the same formula:

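# Try the AM/PM format first; %p only takes effect together with %I (12-hour clock), not %H.
x$ship_date <- ifelse(is.na(as.POSIXct(x$ship_date, format="%d-%b-%Y %I:%M %p")),
                      yes = as.POSIXct(x$ship_date, format="%d-%b-%Y %H:%M"),
                      no = as.POSIXct(x$ship_date, format="%d-%b-%Y %I:%M %p"))
# ifelse() drops the POSIXct class and returns plain numbers, so convert back.
x$ship_date <- as.POSIXct(x$ship_date, origin = "1970-01-01")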

adj_pick_date is computed as below:

# "early" picks move to 03:00 on the same day.
x$adj_pick_date <- ifelse(x$pick_time=="early",
                          as.POSIXct(paste(format(x$pick_date, "%d-%b-%Y"), "03:00"),
                                     format="%d-%b-%Y %H:%M"), x$pick_date)
# "late" picks move to 03:00 on the next day.
x$adj_pick_date <- ifelse(x$pick_time=="late",
                          as.POSIXct(paste(format(x$pick_date+86400, "%d-%b-%Y"),
                                           "03:00"), format="%d-%b-%Y %H:%M"),
                          x$adj_pick_date)
x$adj_pick_date <- as.POSIXct(x$adj_pick_date, origin = "1970-01-01")

pick_time is computed to adjust pick_date: for orders picked between 16:00 and 03:00, the lead time is calculated from 3 AM.
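The adjustment rule can also be written with logical indexing instead of ifelse, which keeps the POSIXct class and avoids the format/paste/parse round trip. This is only a minimal sketch: it assumes pick_time holds exactly the labels "early" and "late" used in the code above, that all times are in UTC (the midnight arithmetic relies on that), and it uses a small made-up data frame.

```r
# Hypothetical example data; the real x comes from the source file.
x <- data.frame(
  pick_date = as.POSIXct(c("2017-04-01 00:51", "2017-04-01 07:51",
                           "2017-04-01 17:30"), tz = "UTC"),
  pick_time = c("early", "okay", "late"),
  stringsAsFactors = FALSE
)

# Midnight of each pick date. Assumption: times are UTC, so seconds since
# the epoch modulo 86400 is the time of day in seconds.
midnight <- x$pick_date - as.numeric(x$pick_date) %% 86400

# %in% returns FALSE (not NA) for missing labels, so the index is NA-safe.
early <- x$pick_time %in% "early"
late  <- x$pick_time %in% "late"

# Start from the original timestamps and overwrite only the adjusted rows.
x$adj_pick_date <- x$pick_date
x$adj_pick_date[early] <- midnight[early] + 3 * 3600   # 03:00 same day
x$adj_pick_date[late]  <- midnight[late] + 27 * 3600   # 03:00 next day
```

Because no ifelse is involved, the column stays POSIXct and the final as.POSIXct(..., origin = "1970-01-01") step is not needed.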

Questions:

  1. How can I generate the adj_pick_date column more efficiently (it is slow now)?
  2. How can I extract the source data as POSIXct using shorter, more efficient code? (It takes 10-15 seconds per million data points on an i7 7th-gen CPU.)
  3. Why did I need to use a different formula for each pair of dates to calculate the difference (one divides by 60, the other does not)?

Sample data (the dates in the source columns pick_date and ship_date are formatted inconsistently, as either "dd-mmm-yyyy hh:mm" or "dd-mmm-yyyy hh:mm am/pm"):

pick_date               ship_date               pick_time
01-apr-2017 00:51       02-apr-2017 06:55
01-apr-2017 00:51       02-apr-2017 12:11 pm
01-apr-2017 07:51       02-apr-2017 12:11 pm    okay
01-apr-2017 02:51 pm    02-apr-2017 09:39       late
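With two formats mixed in one column, one base-R option is to parse with the 12-hour format first and then fill in only the entries that failed. The sketch below assumes an English or C locale (month names and AM/PM markers are locale-dependent), assumes UTC, and uses made-up strings in the two formats described above.

```r
# Hypothetical raw values in the two formats from the sample data.
raw <- c("01-Apr-2017 00:51", "02-Apr-2017 12:11 PM", "01-Apr-2017 02:51 PM")

# First pass: 12-hour clock with AM/PM. Strings without AM/PM become NA.
parsed <- as.POSIXct(raw, format = "%d-%b-%Y %I:%M %p", tz = "UTC")

# Second pass: parse only the failures with the 24-hour format.
failed <- is.na(parsed)
parsed[failed] <- as.POSIXct(raw[failed], format = "%d-%b-%Y %H:%M", tz = "UTC")
```

Each string is parsed at most twice, and the result keeps its POSIXct class, so no conversion back from numeric is required.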

OK, I have some of the solutions now.

  1. Using the lubridate package, this method takes 50% of the processing time:
library(lubridate)
x$adj_pick_date <- ifelse(x$pick_time=="early",
                          dmy_hm(paste(format(x$pick_date, "%d-%b-%Y"), "03:00")),
                          ifelse(x$pick_time=="late",
                                 dmy_hm(paste(format(x$pick_date+86400, "%d-%b-%Y"),
                                              "03:00")), x$pick_date))
x$adj_pick_date <- as.POSIXct(x$adj_pick_date, origin = "1970-01-01")
  2. Again, using lubridate:
x$ship_date <- lubridate::dmy_hm(x$ship_date)
x$pick_date <- lubridate::dmy_hm(x$pick_date)
  3. Probably a formatting error while doing the conversion. I still need help on this problem.
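One likely explanation for the third point, offered as a sketch rather than a confirmed diagnosis: subtracting two POSIXct vectors returns a difftime whose units are picked automatically from the smallest absolute difference in the vector. If the smallest gap in one column pair is under an hour, the whole result comes back in minutes, which would explain the extra division by 60. Calling difftime with explicit units avoids this. The timestamps below are made up.

```r
# Hypothetical timestamps, all in UTC.
pick      <- as.POSIXct("2017-04-01 07:51", tz = "UTC")
ship_far  <- as.POSIXct("2017-04-01 12:11", tz = "UTC")  # 4 h 20 min later
ship_near <- as.POSIXct("2017-04-01 08:21", tz = "UTC")  # 30 min later

units(ship_far - pick)   # "hours"
units(ship_near - pick)  # "mins"

# In a vector, the smallest gap decides the units for every element.
d <- c(ship_far, ship_near) - pick
units(d)                 # "mins"

# Requesting the units explicitly gives the same formula for any column pair.
hrs <- ceiling(as.numeric(difftime(c(ship_far, ship_near), pick,
                                   units = "hours")))
```

With explicit units, the same expression can be used for both ship_date - pick_date and ship_date - adj_pick_date.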
