[NLPL Task Force (A)] Accel Queue
Jeremy Claude Barnes
jeremycb at ifi.uio.no
Tue Dec 3 09:27:39 UTC 2019
Here are the logs, by the way.
Jeremy Barnes
Language Technology Group <https://www.mn.uio.no/ifi/english/research/groups/ltg/>
University of Oslo
Office 7645
jeremycb at ifi.uio.no
________________________________
From: Jeremy Claude Barnes
Sent: 02 December 2019 16:53
To: support at metacenter.no
Cc: infrastructure at nlpl.eu
Subject: Accel Queue
Hello,
I wanted to point out that over the last few days it has been very difficult to schedule an accel job on Saga. There seems to be a single user who has effectively saturated the GPU queue, but their jobs don't appear to use the GPUs effectively:
squeue -p accel > /tmp/accel
# Query each distinct node running an accel job (last squeue column, names ending in cN-N).
for i in $(squeue -p accel | grep -E 'c[0-9]-[0-9]$' | awk '{print $NF}' | sort -u); do
    ssh "$i" nvidia-smi | grep Default
done > ~/nvidia-smi.log
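For a quick summary of that log, something along these lines might help. It is only a minimal sketch: the function name gpu_idle_summary is hypothetical, and it assumes every line kept by "grep Default" is a standard nvidia-smi status row in which GPU utilization is the last field of the form N%.

```shell
# Hypothetical helper: count how many GPUs in the log report 0% utilization.
# Assumes rows shaped like the usual nvidia-smi status line, e.g.
# | N/A   33C    P0    25W / 250W |      0MiB / 16280MiB |      0%      Default |
gpu_idle_summary() {
    awk '
        {
            # A fan-speed percentage may come first, so keep the last N% field.
            util = ""
            for (f = 1; f <= NF; f++)
                if ($f ~ /^[0-9]+%$/) util = $f
            if (util == "") next      # skip lines without a utilization field
            sub(/%$/, "", util)
            total++
            if (util + 0 == 0) idle++
        }
        END { printf "%d of %d GPUs at 0%% utilization\n", idle, total }
    ' "$1"
}
```

It would be called as "gpu_idle_summary ~/nvidia-smi.log". A single nvidia-smi sample is only a snapshot, so a job that is between batches can show 0% at that moment.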
Would it be possible to let the user know that this situation is suboptimal?
Thanks,
Jeremy Barnes
-------------- next part --------------
A non-text attachment was scrubbed...
Name: accel.log
Type: text/x-log
Size: 2712 bytes
Desc: accel.log
URL: <http://lists.nlpl.eu/archives/infrastructure/attachments/20191203/dc781c93/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nvidia-smi.log
Type: text/x-log
Size: 2560 bytes
Desc: nvidia-smi.log
URL: <http://lists.nlpl.eu/archives/infrastructure/attachments/20191203/dc781c93/attachment-0001.bin>