large data volumes - Slowly Increasing Save Time When Saving As in Excel -
ok, stack overflow, far can tell, one's doozy.
i have created macro enabled excel file that, when run, performs following actions (high level):
- user selects template file (which macro enabled) via file dialog
- user selects data files (which not macro enabled) via file dialog
- macro steps through data files and, 1 one, opens them, formats data, migrates data new worksheet in intermediary workbook, , closes data file without saving it
- once files have been looped through, intermediary workbook saved, kept open
- once data files have been looped through, each sheet of intermediary workbook looped through, data in current worksheet transferred template file, , template file saved new, uniquely labeled file. 1 row of data in now-data-containing file copied summary sheet
(it's bit more complicated that, important aspects far can tell)
here's problem; number of data files being selected in thousands (so far biggest run we've attempted 4000 files). macro progresses, time takes these files save gets steadily longer. starts @ 5 seconds, end of files taking 5 minutes save.
the clue have there's iteration feature i've added that, once data files have been looped through, closes template file , opens new instance of different settings, , starts process on again. causes save time go normal , start growing again. summary file saved , closed during step, , new 1 opened new iteration.
i've considered closing , reopening template file every hundred data files or so, , i'll implement if have to, i'd rather proper solution problem band-aid approach. if open , close template file every time, avoid time problem, macro becomes wildly unstable, crash @ random points during run (but sometimes).
this on computer isolated internet or kind of network , saving solid state drive (we've tried control quite few variables).
anyhow, i'm pretty stumped, suggestions welcome!
option explicit public sub example() dim trial integer, trials integer, dataset integer dim trialchecker boolean dim starttime double, waittime double dim startdate date dim filesavename string dim copieddatarange range dim summaryruntimes worksheet, calcs worksheet, cutoffsshifts worksheet dim sheetobjects() worksheet dim intermediaryworkbook workbook, summary workbook, template workbook application.screenupdating = false application.calculation = xlcalculationmanual 'the 1 , trials set lbound , ubound funcitons, premise same trial = 1 trials workbooks.add set summary = activeworkbook 'i use 1 sheet keep track of how long different parts of code take run set summaryruntimes = summary.worksheets(1) summaryruntimes.name = "run times" summaryruntimes.cells(1, 1).value = "id" summaryruntimes.cells(1, 2).value = "data copy time (s)" summaryruntimes.cells(1, 3).value = "formula copy , calc time (s)" summaryruntimes.cells(1, 4).value = "summary copy time (s)" summaryruntimes.cells(1, 5).value = "save , cleanup time (s)" 'sheetnames defined elsewhere in code (it's global variable right now. intend change later). 'it's array of strings 6 elements. counter = lbound(sheetnames) ubound(sheetnames) summary.worksheets.add summary.activesheet.name = sheetnames(counter) next counter 'again, templatelocation defined elsewhere. it's string grabbed filedialog workbooks.open (templatelocation) set template = activeworkbook set calcs = template.sheets("calcs") set cutoffsshifts = template.sheets("log cutoffs & shifts") 'sheetobjects used convenient reference various sheets in template file. found 'it cleaned code bit. might it's unnecessary. counter = lbound(sheetnames) ubound(sheetnames) set sheetobjects(counter) = template.sheets(sheetnames(counter)) next counter 'this parameters given trial set in template file. trialchecker set elsewhere '(it checks yes/no dropdown in original spreadsheet). parameteraddresses range that's grabbed 'table object in original spreadsheet , contains these parameters go in template file. these 'will not change depending on trial, column = 1. trialparameters in same table, , 'parameters themselves. these depend on trial, , column equal trial number if trialchecker = true counter = lbound(parameteraddresses) ubound(parameteraddresses) cutoffsshifts.range(parameteraddresses(counter, 1)).value = trialparameters(counter, trial) next counter end if dataset = 1 intermediaryworkbook.worksheets.count - 1 'this start timers starttime = timer startdate = date 'this data copied intermediary file template. it's 5 'columns wide, can number of rows. summaryruntimes statement merely grabbing unique 'identifier of given worksheet intermediaryworkbook set copieddatarange = calcs.range("$a$3:$e$" & .worksheets(counter).usedrange.rows.count + 1) copieddatarange.value = intermediaryworkbook.worksheets(counter).range("$a$2:$e$" & .worksheets(counter).usedrange.rows.count).value summaryruntimes.cells(counter + 1, 1) = calcs.cells(3, 1).value end 'first timestamp summaryruntimes.cells(counter + 1, 2) = cstr(round(86400 * (date - startdate) + timer - starttime, 1)) starttime = timer startdate = date 'this statement copies down formulas go data (which aobut 100 columsn worth of formuals). 'throughout process, calculation set manual, calculation manually triggered here (don't ask 'me why twice. if recall, it's because pivot tables weird) set copiedformularange = calcs.range("$f$3:$kl$" & calcs.usedrange.rows.count) copiedformularange.filldown application.calculate template.refreshall application.calculate 'second timestamp summaryruntimes.cells(counter + 1, 3) = cstr(round(86400 * (date - startdate) + timer - starttime, 1)) starttime = timer startdate = date 'this separate function copies data template file summary sheet. 'i know can't see code, know copies 6 sets of 7 cells, 'as far can tell, it's not causing problem. timestamp supports idea, 'it's consistent , short call summarypopulate(summary, sheetnames, sheetobjects, r) r = r + 1 'third timestamp summaryruntimes.cells(counter + 1, 4) = cstr(round(86400 * (date - startdate) + timer - starttime, 1)) starttime = timer startdate = date 'these following few lines meant save template file new file. mentioned, 'things bogged down. filenamesuffix string set via inputbox. trialnames set via table object 'mentioned above, , array of strings. application.displayalerts = false if trialchecker = true filesavename = folderlocation & "\" & replace(calcs.cells(3, 1).value, "/", " ") & " ooip " & filenamesuffix & " - " & trialnames(1, trial) & ".xlsm" else filesavename = folderlocation & "\" & replace(calcs.cells(3, 1).value, "/", " ") & " ooip " & filenamesuffix & ".xlsm" end if template.saveas filename:=filesavename, conflictresolution:=xllocalsessionchanges application.displayalerts = true 'this part clears copied data , formulas. added 2 set nothing lines in hopes 'solve problem, doesn't seem copieddatarange.clearcontents copieddatarange.offset(1, 0).rows.delete set copieddatarange = nothing set copiedformularange = nothing 'fourth , final timestamp summaryruntimes.cells(counter + 1, 5) = cstr(round(86400 * (date - startdate) + timer - starttime, 1)) 'it seems run bit better if there's wait line here, i'm not sure why. waittime 'is grabbed original worksheet, , double application.wait (timeserial(hour(now()), minute(now()), second(now()) + waittime)) next dataset 'this saves summary file , closes , template file. process starts anew. 'this seems key resetting reduces run times. if trialchecker = true summary.saveas filename:=folderlocation & "\" & "ooip summary " & filenamesuffix & " - " & trialnames(1, trial) & ".xlsx" else summary.saveas filename:=folderlocation & "\" & "ooip summary " & filenamesuffix & ".xlsx" end if template.close false summary.close false next trial application.screenupdating = true application.calculation = xlcalculationautomatic intermediaryworkbook.close false end sub
sorry post answer, isn't, need bit of space here. went through code, found intermediateworkbook isn't defined , decided defining wouldn't make difference. feel have done might think of doing , study of code not discover haven't discovered. therefore solution in first separating processes , joining them again in different way - or perhaps not. key "solution": if parts can't joined, let them run separately. therefore task set create separate parts.
part 1 described in points 2 4, i.e. creation of intermediate workbook. haven't stated why user must select template before workbook created if has bearing template can opened , closed. important part of suggestion end process when intermediate workbook saved. close it. close template. , project done - part 1.
part 2 open intermediate file , loop through data, creating new files. each of these files based on template. may have provide code enable selection of correct template if there several choose form , if data in intermediate workbook don't support automatic selection. in process have intermediate workbook open plus 1 new file @ time. each file closed before new 1 created. @ end of process intermediate file closed well. (btw occurs me handling of template might cause of problem. in process description template never opened. instead, new workbooks created based on it, , inventor's design.)
part 3 create or open summary file. open each of newly created workbooks , copy 1 row summary. close each workbook , open next. @ end of process close summary workbook.
joining parts: frankly, try integrate part 3 part 2 outset. don't believe having 1 workbook open make difference. if split tasks.
your 2 or 3 separate procedures should in add-in perhaps or workbook nothing hold code (adding 1 more open workbook 2 or 3 others - excel can handle easily). code in workbook add sub calls 2 or 3 procs in turn, 1 after other.
in program structure problem might resurface in part 2 when might take progressively more time save each new workbook. if happens nature of problem have shifted , should easier understand and, hopefully, easier solve.
Comments
Post a Comment