[NLPL Task Force (A)] gpu usage on Abel

Fri May 10 07:51:00 UTC 2019

colleagues,

gpu availability has felt unusually bad over the past few days, and i am
wondering whether some users (by oversight) might be tying up more
resources than they actually use?

unless i misread something, the following suggests that the jobs by
'michalelm' are (just now) only using one gpu on each node:

$ squeue --partition=accel
$ for i in c19-1 c19-11 c19-15 c19-16; do ssh $i nvidia-smi; done
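
for a more compact view than the full nvidia-smi tables, something along
these lines might help. it is only a rough sketch: the helper names
count_idle_gpus and report_idle_gpus are made up for illustration, and it
assumes that nvidia-smi on the nodes supports the --query-gpu interface.
note that 0% utilization is a snapshot at one instant, so a busy job
between batches could show up as idle.

```shell
# count_idle_gpus (hypothetical helper): reads "index, utilization" lines
# on stdin and prints how many gpus report 0% utilization
count_idle_gpus() {
  awk -F', ' '$2 + 0 == 0 { idle++ } END { print idle + 0 }'
}

# report_idle_gpus (hypothetical helper): prints one line per node given
# as an argument, with the number of gpus idle at the moment of the query
report_idle_gpus() {
  for node in "$@"; do
    printf '%s: ' "$node"
    ssh "$node" nvidia-smi --query-gpu=index,utilization.gpu \
      --format=csv,noheader,nounits | count_idle_gpus
  done
}
```

e.g. 'report_idle_gpus c19-1 c19-11 c19-15 c19-16' for the nodes above.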

could you look a little more closely into these jobs, try to determine
whether my impression is in fact correct, and if so gently educate the
user about the nuances of gpu resource requests?

with thanks in advance, oe