[NLPL Task Force (A)] Accel Queue

Jeremy Claude Barnes jeremycb at ifi.uio.no
Tue Dec 3 09:27:39 UTC 2019


Here are the logs, by the way.


Jeremy Barnes

Language Technology Group<https://www.mn.uio.no/ifi/english/research/groups/ltg/>

University of Oslo

Office 7645

jeremycb at ifi.uio.no

________________________________
From: Jeremy Claude Barnes
Sent: 02 December 2019 16:53
To: support at metacenter.no
Cc: infrastructure at nlpl.eu
Subject: Accel Queue


Hello,


I wanted to point out that over the last few days it has been very difficult to schedule an accel job on Saga. A single user seems to have effectively saturated the GPU queue, but their jobs don't appear to use the GPUs effectively:


squeue -p accel > /tmp/accel
for i in $(squeue -p accel | egrep 'c[0-9]-[0-9]$' | sort -u | awk '{print $NF}'); do \
  ssh $i nvidia-smi | grep Default; \
done > ~/nvidia-smi.log


Would it be possible to let the user know that this situation is suboptimal?


Thanks,


Jeremy Barnes

Language Technology Group<https://www.mn.uio.no/ifi/english/research/groups/ltg/>

University of Oslo

Office 7645

jeremycb at ifi.uio.no
-------------- next part --------------
A non-text attachment was scrubbed...
Name: accel.log
Type: text/x-log
Size: 2712 bytes
Desc: accel.log
URL: <http://lists.nlpl.eu/archives/infrastructure/attachments/20191203/dc781c93/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nvidia-smi.log
Type: text/x-log
Size: 2560 bytes
Desc: nvidia-smi.log
URL: <http://lists.nlpl.eu/archives/infrastructure/attachments/20191203/dc781c93/attachment-0001.bin>

