Access reason why Slurm stopped a job


Is there a way to find out why a job was canceled by Slurm? I would like to distinguish the cases where a resource limit was hit from all other reasons (like a manual cancellation). In case a resource limit was hit, I would also like to know which one.

The Slurm log file contains that information explicitly. It is also written to the job's output file, like:

JOB <jobid> CANCELLED AT <time> DUE TO TIME LIMIT

or

job <jobid> exceeded <mem> memory limit, being killed: 

or

JOB <jobid> CANCELLED AT <time> DUE TO NODE FAILURE

etc.
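To act on these messages automatically, one option is to scan the job's output file for them. The following is only a minimal sketch: the function names, the labels, and the regular expressions are illustrative assumptions modeled on the messages quoted above, and the exact wording may differ between Slurm versions. It also assumes that a cancellation line with no stated cause corresponds to some other reason, such as a manual cancellation.

```python
import re

# Hypothetical patterns modeled on the messages quoted above.
# The exact wording may vary between Slurm versions, so adjust as needed.
# Order matters: specific causes are checked before the generic fallback.
PATTERNS = [
    ("time_limit", re.compile(r"CANCELLED\b.*\bDUE TO TIME LIMIT", re.IGNORECASE)),
    ("memory_limit", re.compile(r"exceeded\b.*\bmemory limit", re.IGNORECASE)),
    ("node_failure", re.compile(r"CANCELLED\b.*\bDUE TO NODE FAILURE", re.IGNORECASE)),
    # Assumption: a cancellation line without a stated cause means
    # "some other reason", e.g. a manual cancellation.
    ("cancelled_other", re.compile(r"\bCANCELLED\b", re.IGNORECASE)),
]


def classify_stop_reason(output_text):
    """Return a label for the first matching message, or None if none match."""
    for line in output_text.splitlines():
        for label, pattern in PATTERNS:
            if pattern.search(line):
                return label
    return None


def classify_output_file(path):
    """Read a job output file and classify why the job stopped."""
    with open(path, "r", errors="replace") as handle:
        return classify_stop_reason(handle.read())
```

For example, `classify_output_file("slurm-<jobid>.out")` would return `"time_limit"` if the file contains the time limit message, and `None` if no known message is found (the file name here is a placeholder, not a guaranteed location).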

