[NLPL Task Force (A)] OPUS directory and file counts

Tiedemann, Jörg jorg.tiedemann at helsinki.fi
Fri Dec 8 09:41:55 UTC 2017


Sure, we can remove it again. I'm traveling until Tuesday. If this is urgent, go ahead and remove the opus directory in data yourself.

My apologies for causing all the trouble ...

Jörg 

> On 8 Dec 2017, at 10.10, Stephan Oepen <oe at ifi.uio.no> wrote:
> 
> hi again, joerg,
> 
> i am preparing to put a larger collection of embeddings and
> multi-lingual corpora on-line on Abel and stumbled over the size of
> the part of OPUS that you replicated there.  830 gigabytes in six
> million files and two million directories is problematic in two ways:
> it consumes most of the space we currently have available (one
> terabyte), and it badly challenges the back-up system and our emerging
> filesystem monitoring scripts.
> 
> truth be told, i would suggest we remove the OPUS replica on Abel
> for now and rather look into putting a complete mirror onto our much
> larger storage area on NIRD (the NorStore successor).  that will mean
> that users who need part of the data ‘on-line’ (on the local
> filesystem) will either have to work on Taito or stage it from NIRD to
> Abel.  that is the general design of the current norwegian set-up,
> though my impression is that on the Abel successor (Fram) integration
> with NIRD will be somewhat tighter.
> 
> i am sorry to suggest undoing the replica you did on Abel, but i hope
> you can understand why i am making this proposal!?  which users do
> you reckon would be affected, i.e. what usage scenarios require parts
> of OPUS to be available on-line?
> 
> best wishes, oe



