Fix parallel toy fitting IO problem with HDF using flush/fsync (!52) · Merge requests · Albert Puig Navarro / analysis-tools

Jonas Eschle requested to merge fix_toy_hdf_fail into master Jan 18, 2018

Attempt to fix the parallel toy problem. I may have found something: buffer flushing. (It still does not explain certain log findings, but it is a potential race condition anyway.)

We had:

get_lock()
write_to_file()  # data may still sit in a (NFS) buffer: race condition
release_lock()

... is the file already written or still in the buffer?

-> Added a file.flush() with a blocking fsync (file synchronization to disk), so the data is written out before the lock is released.
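
The idea can be sketched roughly as follows. This is a minimal illustration, not the code from this merge request: `write_synced` is a hypothetical helper, and `threading.Lock` stands in for whatever inter-process file lock the toy jobs actually use. The only point is the ordering: flush and fsync happen while the lock is still held.

```python
import os
import tempfile
import threading


def write_synced(path, data, lock):
    """Append `data` to `path` under `lock` (hypothetical helper).

    `lock` is any context manager; a real setup with several jobs
    would need an inter-process lock rather than a threading.Lock.
    """
    with lock:
        with open(path, "ab") as f:
            f.write(data)
            f.flush()             # push Python's buffer to the OS
            os.fsync(f.fileno())  # block until the OS has written the file out
    # The lock is released only after fsync has returned.


if __name__ == "__main__":
    lock = threading.Lock()  # stand-in for the real lock (assumption)
    path = os.path.join(tempfile.mkdtemp(), "toys.bin")
    workers = [
        threading.Thread(target=write_synced, args=(path, b"fit\n", lock))
        for _ in range(4)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    with open(path, "rb") as f:
        print(f.read().count(b"fit\n"))
```

If the file is opened through pandas' `HDFStore`, `store.flush(fsync=True)` does the same flush-then-fsync step in one call.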

@matzeni although I could not yet confirm either the fix or the bug with minimal examples, give it a try if you find the time to resubmit the failing config (with 50 fits or similar), and let me know ;)

Edited Jan 18, 2018 by Jonas Eschle
