dplyr - R: Rolling sum on a non-standard window
I have an irregular time series, and I'm trying to compute in R a rolling sum over a 3-month window for each operation associated with an id.
The data are structured as follows:

    id  operation  date        value
    a   1          01/01/2017  0
    a   2          01/02/2017  1
    a   3          01/06/2017  1
    a   4          01/09/2017  0
    b   1          01/03/2017  0
    b   2          01/05/2017  1
    b   3          01/09/2017  1
    b   4          01/10/2017  1

I'm looking for this output:

    id  operation  date        value  cumsum
    a   1          01/01/2017  0      0
    a   2          01/02/2017  1      1
    a   3          01/06/2017  1      1
    a   4          01/09/2017  0      1
    b   1          01/03/2017  0      0
    b   2          01/05/2017  1      1
    b   3          01/09/2017  1      1
    b   4          01/10/2017  1      2

Right now I'm using this script:

    db <- db[with(db, order(id, date)), ]
    db <- db %>%
      group_by(id) %>%
      mutate(cumsum = cumsum(value))

but it sums the values of all past operations. How can I introduce a 3-month rolling sum?

It's not possible to flag the 3-month windows in advance, because you want to go back 3 months from every date in your dataset, which means that the reference point (the date) changes every time. Therefore you need a function that takes this into account, and you need to apply it to every row.
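
To make the moving reference point concrete, here is a minimal sketch of the window for a single row, using the dates and values of the first id in the sample data. The names `ref_date`, `window_start`, `in_window` and `window_sum` are illustrative and not from the original post. It uses lubridate's `%m-%` operator instead of plain `-`, which is an assumption on my part: for the first-of-month dates in this sample both give the same result, but `%m-%` also copes with dates such as 31 May, where the target month has no matching day.

```r
library(lubridate)

# Dates and values of the first id in the sample data (day/month/year)
dates  <- dmy(c("01/01/2017", "01/02/2017", "01/06/2017", "01/09/2017"))
values <- c(0, 1, 1, 0)

# Window for the third row: it ends at that row's date
# and starts 3 months earlier
ref_date     <- dates[3]
window_start <- ref_date %m-% months(3)

# Rows whose date falls inside the closed window [window_start, ref_date]
in_window  <- dates >= window_start & dates <= ref_date
window_sum <- sum(values[in_window])
```

Changing the index in `dates[3]` moves both ends of the window, which is why the windows cannot be labelled once up front.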

    library(lubridate)
    library(dplyr)

    # sample dataset
    dt = read.table(text="
    id operation date value
    a 1 01/01/2017 0
    a 2 01/02/2017 1
    a 3 01/06/2017 1
    a 4 01/09/2017 0
    b 1 01/03/2017 0
    b 2 01/05/2017 1
    b 3 01/09/2017 1
    b 4 01/10/2017 1
    ", header=TRUE, stringsAsFactors=FALSE)

    # function that goes back 3 months from a given date, for a given id
    f = function(id_input, date_input) {
      enddate = date_input
      startdate = date_input - months(3)
      sum((dt %>% filter(id == id_input & date >= startdate & date <= enddate))$value)
    }

    # make the function accept vectors of ids and dates
    f = Vectorize(f)

    # update date column
    dt$date = dmy(dt$date)

    # run function for every row
    dt %>% mutate(sumvalue = f(id, date))

    #   id operation       date value sumvalue
    # 1  a         1 2017-01-01     0        0
    # 2  a         2 2017-02-01     1        1
    # 3  a         3 2017-06-01     1        1
    # 4  a         4 2017-09-01     0        1
    # 5  b         1 2017-03-01     0        0
    # 6  b         2 2017-05-01     1        1
    # 7  b         3 2017-09-01     1        1
    # 8  b         4 2017-10-01     1        2

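
A possible variation on the code above, offered as a sketch rather than as the answer's own method: it keeps the same per-row logic but passes each id's dates and values to the function explicitly, so the function does not read the global `dt`. The helper name `rolling_sum_3m` is hypothetical, the id `a` for the first four rows is an assumption about the sample data, and `%m-%` is swapped in for `-` so that end-of-month dates would not yield `NA`.

```r
library(dplyr)
library(lubridate)

# Sample data; the id "a" for the first four rows is assumed
dt <- data.frame(
  id        = rep(c("a", "b"), each = 4),
  operation = rep(1:4, times = 2),
  date      = dmy(c("01/01/2017", "01/02/2017", "01/06/2017", "01/09/2017",
                    "01/03/2017", "01/05/2017", "01/09/2017", "01/10/2017")),
  value     = c(0, 1, 1, 0, 0, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each date, sum the values dated within the
# preceding 3 months (both ends of the window inclusive)
rolling_sum_3m <- function(dates, values) {
  vapply(seq_along(dates), function(i) {
    start <- dates[i] %m-% months(3)
    sum(values[dates >= start & dates <= dates[i]])
  }, numeric(1))
}

# Inside a grouped mutate, date and value are the current id's rows only
result <- dt %>%
  group_by(id) %>%
  mutate(sumvalue = rolling_sum_3m(date, value)) %>%
  ungroup()
```

Like the original function, this compares every row with every other row of the same id, so the work grows quadratically with group size. That is fine for small data; for large groups a join-based or dedicated rolling-window approach may be worth considering.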