Python 3.5: Tarfile append mode "ReadError" on empty tar


I have encountered a problem when creating new tarfiles with the tarfile module in the latest Python 3.5 package. The problem is similar to those discussed here and here. In the former case, the suggested solution returns the error "ReadError: empty header", and in the latter case it is an issue that was closed 8 years ago, so the patch should already be applied in the latest version of the language. The docs for the tarfile module explicitly state that "a" can be used to create new tarfiles.

The issue arises when attempting to create a new tarfile. The code that generates it is replicated in full below; it is a simple script I am using as a benchmark task in multiprocessing.

    # Just a little script to test copy times (control)

    import os
    import os.path
    import time
    import shutil
    import multiprocessing
    import tarfile

    source = "/home/patches/documents/scripting projects/resources/test payload"
    dest = "/home/patches/desktop/copy benchmark  output.tar.bz2"
    start = None
    end = None
    testing = True
    numconsumers = multiprocessing.cpu_count()


    # Classes!
    class CopyProc(multiprocessing.Process):
        def __init__(self, qtask):
            multiprocessing.Process.__init__(self)
            self.qtask = qtask
            os.chdir(source)

        def run(self):
            proc_name = self.name
            while True:
                next_task = self.qtask.get()
                if next_task is None:
                    # Poison pill means shutdown
                    print('%s: exiting' % proc_name)
                    self.qtask.task_done()
                    break
                next_task()
                self.qtask.task_done()
            return


    class CopyJob(object):
        def __init__(self, a):
            self.tgt = a

        def __call__(self):
            # Append this file to the shared archive
            tar = tarfile.open(dest, "a")
            tar.add(self.tgt)
            tar.close()


    # Functions
    def announce():
        print("Starting copy benchmark - multiprocessing.")
        foo = input("Press any key to begin")


    def starttimer():
        global start
        start = time.time()


    def setup():
        os.chdir(source)
        for a, b, files in os.walk(source):
            for file in files:
                tasks.put(CopyJob(file))
        # One poison pill per consumer
        for i in range(numconsumers):
            tasks.put(None)


    def endtimer():
        global end
        end = time.time()


    def prompt():
        global testing
        diff = end - start
        # os.remove(dest)
        print("The test took %s seconds" % str(diff))
        bar = input("Run again? y/n")
        if bar == "n":
            testing = False


    # Runtime
    if __name__ == '__main__':
        multiprocessing.set_start_method("spawn")
        tasks = multiprocessing.JoinableQueue()
        announce()
        starttimer()
        setup()
        consumers = []
        for i in range(numconsumers):
            consumers.append(CopyProc(tasks))
        for w in consumers:
            w.start()
        tasks.join()
        endtimer()
        prompt()

Edited to add: the problem is that, instead of the expected behaviour, the script throws a "ReadError: empty header" exception when attempting to open the tarfile.

So it turns out the issue was that multiple processes were trying to open the file more or less simultaneously. The solution is to use some form of locking to prevent this.
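As a rough illustration of why simultaneous opens can produce this error, the sketch below opens a zero-byte file in append mode. It assumes (this is my guess at the mechanism, not something confirmed above) that one process sees the archive just after another process has created it but before any data has been written. The function name and the temporary file are made up for the example.

```python
import os
import tarfile
import tempfile


def open_empty_archive_for_append():
    """Open a zero-byte file in append mode and return the error text, if any."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "empty.tar")
        # Assumption: this stands in for an archive that another process has
        # created on disk but not yet written any data to.
        open(path, "wb").close()
        try:
            with tarfile.open(path, "a"):
                return None
        except tarfile.ReadError as exc:
            return str(exc)


print(open_empty_archive_for_append())
```

If the guess is right, the message printed here matches the one from the benchmark script, even though only a single process is involved.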

In my case, I used a simple multiprocessing.Lock() object and had the processes acquire it before appending and release it afterward. This solution is still faster than performing the appends in a single process. I'm not sure why.

I hope that if you, like me, have had this problem, this solution works for you as well.
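A minimal sketch of that locking approach follows. It is not the exact code from my script: the function names (append_member, worker, archive_in_parallel) and the arcname choice are made up for the example, and the lock is passed to each process as an argument so that it also works with the "spawn" start method.

```python
import multiprocessing
import os
import tarfile


def append_member(lock, archive_path, member_path):
    """Append one file to the archive while holding the shared lock."""
    with lock:
        # Mode "a" creates the archive if it does not exist yet.
        with tarfile.open(archive_path, "a") as tar:
            tar.add(member_path, arcname=os.path.basename(member_path))


def worker(lock, queue, archive_path):
    """Consume paths from the queue until the None poison pill arrives."""
    while True:
        member_path = queue.get()
        try:
            if member_path is None:
                break
            append_member(lock, archive_path, member_path)
        finally:
            queue.task_done()


def archive_in_parallel(archive_path, member_paths, num_workers=2):
    """Start worker processes that all append to the same archive."""
    lock = multiprocessing.Lock()
    queue = multiprocessing.JoinableQueue()
    procs = [
        multiprocessing.Process(target=worker, args=(lock, queue, archive_path))
        for _ in range(num_workers)
    ]
    for proc in procs:
        proc.start()
    for path in member_paths:
        queue.put(path)
    for _ in procs:
        queue.put(None)  # one poison pill per worker
    queue.join()
    for proc in procs:
        proc.join()
```

With "spawn", archive_in_parallel should be called from inside an `if __name__ == '__main__':` block, as in the original script. Only one process has the archive open at any moment, so no process can see a half-written file.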

