Parallel processing - worst-case scenario: launched two copies of a program which appends lines to a file
I have a Python program that performs a simple operation on a file:
    import csv

    with open(self.cache_filename_url, "a", encoding="utf8") as f:
        w = csv.writer(f, delimiter=',', quotechar='"', lineterminator='\n')
        w.writerow([cache_url, rpd_products])
As you can see, it opens the file and appends a CSV line to it. It does this a lot, in a loop.
I accidentally ran two copies of the program simultaneously, so I think they may have been appending to the file at the same time. I am trying to determine the worst-case scenario for file corruption.
Do you think the writes are at least atomic operations in this case? For example, the following wouldn't be a problem for me:
    old line
    old line
    new line written by instance 1
    new line written by instance 2
    new line written by instance 1
But this would be a problem for me:
    old line
    old line
    [half of new line written by instance 1][half of new line written by instance 2]
    etc.
To put it another way: is it possible for the two append operations to "interfere" with each other?
Edit: I am using Windows 7.
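For what it's worth, you can probe this empirically by spawning two writer processes against a scratch file and then scanning it for malformed rows. This is a minimal sketch, not a definitive test; the file name, `worker`, and the row shape are all made up for illustration:

    import csv
    import multiprocessing


    def worker(tag, path, n_rows):
        # Each process appends n_rows CSV lines tagged with its own id.
        for i in range(n_rows):
            with open(path, "a", encoding="utf8", newline="") as f:
                w = csv.writer(f, delimiter=',', quotechar='"',
                               lineterminator='\n')
                w.writerow([tag, i, "x" * 100])


    if __name__ == "__main__":
        path = "race_test.csv"
        procs = [multiprocessing.Process(target=worker, args=(t, path, 10000))
                 for t in ("A", "B")]
        for p in procs:
            p.start()
        for p in procs:
            p.join()

        # A torn or overwritten write shows up as a row that no longer
        # parses back into exactly three fields.
        with open(path, encoding="utf8", newline="") as f:
            bad = [row for row in csv.reader(f) if len(row) != 3]
        print(f"{len(bad)} malformed rows")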
Opening the same file multiple times in shared write mode can be problematic. And if you don't open it in shared mode, one of the instances will throw an exception because it cannot open the file.
If in shared mode: both instances have their own internal file pointer. In most cases they will write independently. You could get the following sequence:
1. Process A opens the file and sets its pointer to the end (byte 1024).
2. Process B opens the file and sets its pointer to the end (byte 1024).
3. Process B writes at byte 1024 and closes the file.
4. Process A writes at byte 1024 and closes the file.
Both processes will have written to the file at the same location, so you've lost the record from process B. And depending on how the close works (whether it truncates), if the lines they write are different lengths, you may see a leftover fragment of process B's line if it was the longer one.
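That overlap is easy to reproduce in a single script by holding two handles to the same file and seeking both to the end before writing. A toy demonstration, using explicit seeks to stand in for the independent pointers described above (file name and contents are illustrative):

    import os

    path = "overlap_demo.txt"
    with open(path, "w") as f:
        f.write("old line\n" * 2)      # 18 bytes of existing data

    # Two independent handles, each with its own file pointer.
    f1 = open(path, "r+b")
    f2 = open(path, "r+b")
    f1.seek(0, os.SEEK_END)            # both pointers now sit at byte 18
    f2.seek(0, os.SEEK_END)

    f2.write(b"new line written by instance 2\n")
    f2.close()
    f1.write(b"short line from 1\n")   # overwrites the start of B's line
    f1.close()

    print(open(path, "rb").read())
    # b'old line\nold line\nshort line from 1\ny instance 2\n'
    # The tail of B's longer line survives as garbage after A's line.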
If in exclusive mode, one process will fail to open the file, and whatever exception handling you have will kick in.
Which mode you end up in can be system dependent; Python doesn't seem to provide a mechanism for controlling the share mode.
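If you need both copies to run safely, one option on Windows is to serialize the appends yourself with `msvcrt.locking` from the standard library. A sketch under that assumption; the separate lock file and the retry loop are my own convention, not a drop-in fix:

    import csv
    import msvcrt
    import time


    def append_row(csv_path, lock_path, row):
        # Use a separate lock file as a cross-process mutex, so the byte
        # offset passed to msvcrt.locking is always simply byte 0.
        with open(lock_path, "a") as lock:
            lock.seek(0)
            while True:
                try:
                    # LK_NBLCK raises OSError immediately if another
                    # process already holds the lock.
                    msvcrt.locking(lock.fileno(), msvcrt.LK_NBLCK, 1)
                    break
                except OSError:
                    time.sleep(0.05)
            try:
                with open(csv_path, "a", encoding="utf8", newline="") as f:
                    w = csv.writer(f, delimiter=',', quotechar='"',
                                   lineterminator='\n')
                    w.writerow(row)
            finally:
                lock.seek(0)
                msvcrt.locking(lock.fileno(), msvcrt.LK_UNLCK, 1)

Each process calls `append_row` instead of writing directly, so only one of them can be inside the CSV write at a time.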