Access reason why Slurm stopped a job


Is there a way to find out why a job was canceled by Slurm? I would like to distinguish the cases where a resource limit was hit from all other reasons (like a manual cancellation). In case a resource limit was hit, I would also like to know which one.

The Slurm log file contains that information explicitly. It is also written to the job's output file, like this:

JOB <jobid> CANCELLED AT <time> DUE TO TIME LIMIT

or

Job <jobid> exceeded <mem> memory limit, being killed:

or

JOB <jobid> CANCELLED AT <time> DUE TO NODE FAILURE

etc.
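
To tell these cases apart automatically, one option is to scan the job's output file for such messages. The following is only a minimal sketch: it assumes lines of the form `JOB <jobid> CANCELLED AT <time> DUE TO TIME LIMIT`, `Job <jobid> exceeded <mem> memory limit` and `JOB <jobid> CANCELLED AT <time> DUE TO NODE FAILURE`, and the exact wording may differ between Slurm versions. The reason labels and the function names `classify_line` and `classify_output` are made up for this illustration and are not part of Slurm.

```python
import re

# Patterns modeled on the message shapes quoted in the answer.
# Assumption: the wording matches these shapes; adjust for your Slurm version.
# Order matters: specific reasons come before the generic "cancelled" fallback.
PATTERNS = [
    ("time_limit",
     re.compile(r"JOB\s+(\S+)\s+CANCELLED AT\s+\S+\s+DUE TO TIME LIMIT", re.I)),
    ("memory_limit",
     re.compile(r"Job\s+(\S+)\s+exceeded\s+.+?\s+memory limit", re.I)),
    ("node_failure",
     re.compile(r"JOB\s+(\S+)\s+CANCELLED AT\s+\S+\s+DUE TO NODE FAILURE", re.I)),
    # No stated cause after the timestamp, e.g. a manual cancellation.
    ("cancelled_other",
     re.compile(r"JOB\s+(\S+)\s+CANCELLED AT\s+\S+", re.I)),
]


def classify_line(line):
    """Return (reason, jobid) for a recognized message line, else None."""
    for reason, pattern in PATTERNS:
        match = pattern.search(line)
        if match:
            return reason, match.group(1)
    return None


def classify_output(text):
    """Return the first recognized (reason, jobid) in the output text, else None."""
    for line in text.splitlines():
        result = classify_line(line)
        if result is not None:
            return result
    return None
```

It could be used by reading the job's output file into a string and passing it to `classify_output`. A result of `time_limit` or `memory_limit` indicates which resource limit was hit, while `cancelled_other` covers cancellations that give no cause.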

