[NLPL Task Force (A)] OPUS directory and file counts
Stephan Oepen
oe at ifi.uio.no
Fri Dec 8 09:10:22 UTC 2017
hi again, joerg,
i am preparing to put a larger collection of embeddings and
multi-lingual corpora on-line on Abel and stumbled over the size of
the part of OPUS that you replicated there. 830 gigabytes in six
million files and two million directories is problematic in two ways:
it consumes most of the space we currently have available (one
terabyte), and it badly challenges the back-up system and our emerging
filesystem monitoring scripts.
truth be told, i would suggest we remove the OPUS replica on Abel
for now and rather look into putting a complete mirror onto our much
larger storage area on NIRD (the NorStore successor). that will mean
that users who need part of the data ‘on-line’ (on the local
filesystem) will either have to work on Taito or stage it from NIRD to
Abel. that is the general design of the current norwegian set-up,
though my impression is that on the Abel successor (Fram) integration
with NIRD will be somewhat tighter.
i am sorry to suggest undoing the replica you did on Abel, but i hope
you can understand why i am making this proposal!? which users do
you reckon would be affected, i.e. what usage scenarios require parts
of OPUS to be available on-line?
best wishes, oe