<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">
<div class=""><br class="">
</div>
<div class="">So, my subset of OPUS data now occupies 715GB on abel.</div>
<div class="">Let me know if that is OK. Otherwise I can reduce it by, for example, removing monolingual data files or the plain-text bitexts that can be regenerated from the native XML versions.</div>
<br class="">
<div apple-content-edited="true" class="">
<div style="color: rgb(0, 0, 0); letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">
<div style="color: rgb(0, 0, 0); letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">
<div style="color: rgb(0, 0, 0); letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">
<div class="" style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">
<span class="" style="orphans: 2; widows: 2;">All the best,</span></div>
<div class="" style="orphans: 2; widows: 2; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">
Jörg</div>
<div class="" style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">
<span class="" style="orphans: 2; widows: 2;"><br class="">
</span></div>
<div class="" style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">
<span class="" style="orphans: 2; widows: 2;">********************************************************************************************</span><br class="" style="orphans: 2; widows: 2;">
<span class="" style="orphans: 2; widows: 2;">Jörg Tiedemann</span></div>
<div class="" style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">
<span class="" style="orphans: 2; widows: 2;">Language Technology<span class="Apple-tab-span" style="white-space: pre;">
</span></span><a href="https://blogs.helsinki.fi/language-technology/" class="">https://blogs.helsinki.fi/language-technology/</a></div>
<div class=""><span style="orphans: 2; widows: 2;" class="">University of Helsinki</span></div>
</div>
</div>
</div>
</div>
<br class="">
<div>
<blockquote type="cite" class="">
<div class="">On 19 Dec 2018, at 15:43, Stephan Oepen &lt;<a href="mailto:oe@ifi.uio.no" class="">oe@ifi.uio.no</a>&gt; wrote:</div>
<br class="Apple-interchange-newline">
<div class="">hi joerg,<br class="">
<br class="">
our storage on Abel was extended to two terabytes this fall, and<br class="">
currently we have some 800 gigabytes available.<br class="">
<br class="">
i feel i (still) know too little about OPUS to say whether a partial<br class="">
replica on Abel would be beneficial to NLPL users? could you suggest<br class="">
a sub-set (below 800 gigabytes) to mirror from Taito, and sketch a<br class="">
typical use case? could we sketch the recipe for a user to train<br class="">
their OpenNMT-py system (more or less) straight from the OPUS<br class="">
directory?<br class="">
<br class="">
cheers, oe<br class="">
<br class="">
On Wed, Dec 19, 2018 at 2:37 PM Tiedemann, Jörg<br class="">
&lt;<a href="mailto:jorg.tiedemann@helsinki.fi" class="">jorg.tiedemann@helsinki.fi</a>&gt; wrote:<br class="">
<blockquote type="cite" class=""><br class="">
<br class="">
This is especially for Stephan: One of the deliverables for this year in the OPUS activity is to create a partial mirror of OPUS data on abel. So far, I still don’t really know what we would like to make available and what kind of space we have for that on
abel. In some sense, it could be enough to have that availability via the NIRD storage that you already fill with OPUS data, right? This also counts as long-term storage, I guess. I also have the data in IDA here on CSC.<br class="">
<br class="">
This is activity G1.4 and I wonder if I have to do something about it:<br class="">
<a href="http://wiki.nlpl.eu/index.php/Infrastructure/home" class="">http://wiki.nlpl.eu/index.php/Infrastructure/home</a><br class="">
<br class="">
All the best,<br class="">
Jörg<br class="">
<br class="">
********************************************************************************************<br class="">
Jörg Tiedemann<br class="">
Language Technology https://blogs.helsinki.fi/language-technology/<br class="">
University of Helsinki<br class="">
<br class="">
</blockquote>
</div>
</blockquote>
</div>
<br class="">
</body>
</html>