sas - Sum consecutive observations by some variable -
i'm learning use sas, bear me bit. have following sample patient data on prescription usage , i'd try combine observations form more of patient story, keep timeline intact:
data have; input dose $2. id $4. supply date $8.; datalines; "5" 1234 30 01012015 "10" 1234 30 02012015 "10" 1234 30 03012015 "5" 1234 30 04012015 "2" 1234 30 05012015 "5" 4321 30 07012016 "2" 9876 30 05012016 "2" 9876 30 06012016 "10" 9876 30 07012016 ; run;
where dose dosage in mg, id patient id, supply number of days' supply of medication, , date date of refill.
i'd consolidate of observations when @ patient 1234 can see taking 5mg 30 days, 10mg 60 days, 5 mg again 30 days, etc. of summation , group commands i've learned combine observations 1 , 4 together, patient story dosage increased , decreased, , i'd keep intact don't know how.
so this:
data want; input dose $2. id $4. supply date $8.; datalines; "5" 1234 30 01012015 "10" 1234 60 02012015 "5" 1234 30 04012015 "2" 1234 30 05012015 "5" 4321 30 07012016 "2" 9876 60 05012016 "10" 9876 30 07012016 ; run;
see observation 3 rolled 2, 8 7, etc.
any tips appreciated!
here 1 solution relying on retain variables. 1 among many, , uses rather advanced techniques scare crap out of beginner. have been warned ;)
the use of goto
& labels (ending :
) not common , in most cases can avoided. in situation this, seems warranted, concision.
data have; informat id 4. dose 3. supply 3. date mmddyy8.; format date mmddyy10.; input id dose supply date; datalines; 1234 5 30 01012015 1234 10 30 02012015 1234 10 30 03012015 1234 5 30 04012015 1234 2 30 05012015 4321 5 30 07012016 9876 2 30 05012016 9876 2 30 06012016 9876 10 30 07012016 ;
we first make sure our data sorted.
proc sort data=have; id date; run;
the solution
the retain
statement make values declared variables kept in memory data step iterates on rows of have
data set.
note _i
suffix added existing variables have
, i standing input.
data want(drop=id_i dose_i supply_i date_i); format id dose supply 8. date mmddyy10.; retain id dose supply date; set have(rename=(id=id_i dose=dose_i supply=supply_i date=date_i)) end=last; if _n_ = 1 goto propagate; if id_i = id , dose_i = dose do; supply = supply + supply_i; goto checklast; end; * when id or dose different previous row, ; * write observation want table. ; output; propagate: id = id_i; dose = dose_i; supply = supply_i; date = date_i; checklast: if last output; run;
a few things note here:
_n_
automatic sas variable indicating current iteration numberend=last
(used parameterset
statement) creates variable calledlast
(this arbitrary name) take on value1
when last observation readhave
, ,0
otherwise. use boolean variable @ end of data step.- keep in mind, in trying figure out, data step functions
for
loop, iterating on rows of source table.
results
id dose supply date 1234 5 30 01/01/2015 1234 10 60 02/01/2015 1234 5 30 04/01/2015 1234 2 30 05/01/2015 4321 5 30 07/01/2016 9876 2 60 05/01/2016 9876 10 30 07/01/2016
Comments
Post a Comment