[NLPL Task Force (A)] Jobs' output

Stephan Oepen oe at ifi.uio.no
Fri Sep 27 20:21:29 UTC 2019


hi gongbo,

i cannot say that i have used OpenNMT much myself, but more generally:
unless i run something that is very i/o-intensive, i do not take the
trouble of copying input and output data to and from the $SCRATCH
filesystem, so i doubt you need to worry about chkfile and friends.
i would just work out of your home directory, i.e. read and write
data there.

the SLURM log file you sent does not look as if the job actually has
completed?  i assume by 'output files' you mean files generated during
the OpenNMT run, i.e. the actual model file?  i might guess that the
model is only serialized to disk upon completion of the training, so
could it be the case that your job actually had not gotten to that
point?
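
one quick way to check whether a run ever got as far as writing a model is to scan the SLURM log for the checkpoint message. the sketch below is only an illustration: the marker string "Saving checkpoint" is taken from the quoted message further down, and the log file name in the usage note is hypothetical.

```python
from pathlib import Path
from typing import List


def find_marker_lines(log_path: Path, marker: str = "Saving checkpoint") -> List[str]:
    """Return every line of the log that contains the marker string."""
    # errors="replace" keeps the scan going even if the log holds odd bytes.
    with log_path.open(encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in handle if marker in line]
```

for example, `find_marker_lines(Path("slurm-12345.out"))` (a made-up file name) returns an empty list if the job never reported saving a checkpoint.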

a general piece of advice: to debug it might help to reduce the
problem to a tiny training file, possibly even something that can
complete in a matter of a few minutes on a cpu node.  that should
allow you to find out where the output file(s) end up, and once you
have a working set-up, you can submit larger jobs (to the gpu nodes).
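
a minimal sketch of how such a tiny training set could be cut from a larger one, assuming the training data is a pair of line-aligned plain-text files (source and target). the function names, the `tiny.` file prefix, and the default of 1000 lines are all assumptions made for illustration, not part of OpenNMT or the NLPL installation.

```python
from itertools import islice
from pathlib import Path
from typing import Tuple


def head_copy(src: Path, dst: Path, n_lines: int) -> int:
    """Copy the first n_lines lines of src to dst; return the number written."""
    written = 0
    with src.open(encoding="utf-8") as fin, dst.open("w", encoding="utf-8") as fout:
        for line in islice(fin, n_lines):
            fout.write(line)
            written += 1
    return written


def make_tiny_corpus(src_file: Path, tgt_file: Path, out_dir: Path,
                     n_lines: int = 1000) -> Tuple[Path, Path]:
    """Write the first n_lines sentence pairs of a parallel corpus to out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    tiny_src = out_dir / f"tiny.{src_file.name}"
    tiny_tgt = out_dir / f"tiny.{tgt_file.name}"
    n_src = head_copy(src_file, tiny_src, n_lines)
    n_tgt = head_copy(tgt_file, tiny_tgt, n_lines)
    # The two sides must stay aligned line by line.
    if n_src != n_tgt:
        raise ValueError(f"line counts differ: {n_src} source vs {n_tgt} target")
    return tiny_src, tiny_tgt
```

pointing the usual preprocessing and training steps at the two tiny files should make the whole pipeline finish quickly, which makes it much easier to see where the output ends up.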

best wishes, oe

On Fri, Sep 27, 2019 at 10:09 PM Gongbo Tang <gongbo.tang at lingfil.uu.se> wrote:
>
> Hi,
>
>
> I ran into a problem. I cannot find any output files/models after running a job, or perhaps the job did not generate any models while running.
>
>
> I am using OpenNMT 0.2.1, maintained by NLPL. I did not find any "Saving checkpoint ..." messages in the log file, although they should appear there. I attached the slurm file and the job script.
>
>
> I tried using the "chkfile" and "cleanup" commands to save the outputs, following the guide here (https://www.uio.no/english/services/it/research/hpc/abel/help/user-guide/job-scripts.html#Work_Directory), but I was told that "chkfile" and "cleanup" were not found.
>
>
> I also tried setting the output directory to my home directory (~, /usit/abel/u1/gtang), but I still got nothing.
>
>
> Could you please tell me how I can get the job's outputs? Thanks a lot!
>
>
> Best,
>
> Gongbo



