[NLPL Task Force (A)] [uninett.no #199370] Accel Queue
thierry.toutain via RT
support at metacenter.no
Tue Dec 3 10:11:58 UTC 2019
Hello Jeremy,
We will check with the user how his jobs use the GPUs.
Thierry
On Mon Dec 02 16:53:57 2019, jeremycb at ifi.uio.no wrote:
Hello,
I wanted to point out that over the last few days it has been very
difficult to schedule an accel job on Saga. There seems to be a single
user who has effectively saturated the GPU queue, but their jobs don't
use the GPUs effectively:
squeue -p accel > /tmp/accel
# Take the node name (last column) of each running job, de-duplicated
for i in $(grep -E 'c[0-9]-[0-9]$' /tmp/accel | awk '{print $NF}' | sort -u); do
    ssh "$i" nvidia-smi | grep Default
done > ~/nvidia-smi.log
Would it be possible to let the user know that this situation is
suboptimal?
Thanks,
Jeremy Barnes
Language Technology Group
University of Oslo
Office 7645
jeremycb at ifi.uio.no