[NLPL Infrastructure] Fwd: [CSC #479570] lustre-related error in custom installation of binutils

Andrey Kutuzov andreku at ifi.uio.no
Wed May 5 19:13:53 UTC 2021


OK, it seems that in fact all we have to do is add the following line to 
the GCCcore easyconfig:

use_gold_linker = False

Then it will use the good old ld instead of ld.gold, which eliminates
the fallocate problem (possibly at the cost of marginally slower linking).
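
To check whether a given directory is affected at all, a small probe along these lines could be used. This is only a sketch: it assumes that Python's os.posix_fallocate goes through the same libc call that ld.gold trips over, and the helper name fallocate_works is made up for illustration.

```python
import os
import tempfile


def fallocate_works(directory):
    """Return True if posix_fallocate succeeds on a scratch file in directory.

    Hypothetical helper, not part of EasyBuild or binutils.
    """
    # Create a throwaway file in the directory under test; it is removed on close.
    with tempfile.NamedTemporaryFile(dir=directory) as scratch:
        try:
            # 7840 bytes is the size seen in the strace output quoted further down.
            os.posix_fallocate(scratch.fileno(), 0, 7840)
        except OSError as error:
            print(f"{directory}: posix_fallocate failed: {error}")
            return False
    return True


if __name__ == "__main__":
    # Compare a local filesystem with the current directory (e.g. somewhere on lustre).
    for directory in ("/tmp", os.getcwd()):
        print(directory, fallocate_works(directory))
```

If the assumption holds, running this from a directory below /projappl/ should report a failure, while /tmp should pass.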

I have tested it in my own EasyBuild environment on Puhti. However, I
can't change our NLPL easyconfig at
/projappl/nlpl/software/20/packages/EasyBuild/4.3.4/easybuild/easyconfigs/g/GCCcore/GCCcore-8.3.0.eb

...because it is write-protected for the group.

I guess we would want to apply a patch to this easyconfig at the setup
stage once we detect that we are on Puhti.
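
Such a setup-stage patch could look roughly like the sketch below. The host-name test and all helper names are assumptions for illustration (the login nodes are called puhti-login1, but compute nodes may be named differently). The override is appended at the end of the file on the assumption that, as in plain Python, a later assignment in an easyconfig takes precedence over an earlier one.

```python
import socket
from pathlib import Path

# The one-line override from this message.
OVERRIDE = "use_gold_linker = False"


def on_puhti(hostname=None):
    """Guess whether we are on Puhti (assumption: the host name contains 'puhti')."""
    if hostname is None:
        hostname = socket.gethostname()
    return "puhti" in hostname.lower()


def patch_easyconfig(text):
    """Return the easyconfig text with the override appended, unless already present."""
    if any(line.strip() == OVERRIDE for line in text.splitlines()):
        return text
    if text and not text.endswith("\n"):
        text += "\n"
    return text + OVERRIDE + "\n"


def patch_file(path):
    """Patch the easyconfig at path in place; return True if it was changed."""
    path = Path(path)
    original = path.read_text()
    patched = patch_easyconfig(original)
    if patched == original:
        return False
    path.write_text(patched)
    return True
```

The setup script would then call patch_file() on its copy of GCCcore-8.3.0.eb only when on_puhti() returns True. Running it twice is harmless, since the override is not appended again.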

On 05.05.2021 17:33, Stephan Oepen wrote:
> if that in fact is a valid option to the configure script for binutils; 
> could you check?
> 
> oe
> 
> 
> On Wed, 5 May 2021 at 17:08 Andrey Kutuzov <andreku at ifi.uio.no> wrote:
> 
>     No, I have not received this.
> 
>     So, we probably should try to add "--no-posix-fallocate" to the binutils
>     easyconfig, right?
> 
>     On 05.05.2021 16:59, Stephan Oepen wrote:
>      > a very helpful reply from CSC; see below.  did you get this from their
>      > RT queue already, andrey?
>      >
>      > oe
>      >
>      >
>      > ---------- Forwarded message ---------
>      > From: Henrik Nortamo via RT <research-support at csc.fi>
>      > Date: Wed, May 5, 2021 at 10:02 AM
>      > Subject: [CSC #479570] lustre-related error in custom installation of binutils
>      > To: <oe at ifi.uio.no>
>      >
>      >
>      > Hi,
>      >
>      > Binutils is not provided as a module on Puhti, but the versions which
>      > gcc 8.3.0 uses can be found under
>      > /appl/spack/install-tree/gcc-4.8.5/binutils-2.31.1-xbslr3/lib/.
>      >
>      > That particular error comes from our version of lustre (2.12.6) not
>      > supporting the "fallocate" system call (it has been implemented in
>      > lustre 2.14.0, https://jira.whamcloud.com/browse/LU-3606). The
>      > returned error code might also not be correct (see
>      > https://jira.whamcloud.com/browse/LU-14301), which means that some
>      > programs' fallbacks from fallocate do not work properly.
>      >
>      > So I'm guessing EasyBuild builds binutils on a non-lustre filesystem
>      > (e.g. /tmp, /local_scratch, $TMPDIR or similar), checks that
>      > fallocate is supported there, and then fails when you try to compile
>      > on lustre. I'm not fully certain, but you might also
>      > be able to disable fallocate support manually when building binutils
>      > (could require patching the configure scripts).
>      >
>      > A temporary workaround is to add "--no-posix-fallocate" to LDFLAGS
>      > when compiling.
>      >
>      >
>      > Br.
>      > Henrik Nortamo
>      > CSC
>      >
>      >
>      > On Tue May 04 23:19:05 2021, oe at ifi.uio.no wrote:
>      >> dear colleagues:
>      >>
>      >> in the context of the NLPL use case in EOSC-Nordic, we are preparing
>      >> an updated version of what we call the NLPL virtual laboratory,
>      >> essentially a community-maintained collection of core software and
>      >> data resources for natural language processing.  on Puhti, our virtual
>      >> laboratory resides in '/projappl/nlpl/' and is collectively maintained
>      >> by EOSC-Nordic project members at helsinki and oslo universities.
>      >>
>      >> for perfect parallelism of the installed software, we have now fully
>      >> automated the process of compiling and installing a collection of
>      >> dozens of packages, using EasyBuild.  in doing so, we use system-wide
>      >> modules where they are available (in the exact same versions and
>      >> configurations) and let EasyBuild fall back on building dependencies
>      >> as needed.  on Puhti, that means we end up compiling, among other
>      >> things, our own versions of gcc and GNU binutils.
>      >>
>      >> it now appears that the version of binutils created by the stock
>      >> EasyBuild recipe on Puhti ends up incompatible with the lustre
>      >> filesystem below '/projappl/'.  we have isolated the problem outside
>      >> the EasyBuild environment, and it appears to boil down to ld.gold
>      >> failing in fallocate(2) with
>      >>
>      >> [oe at puhti-login1 ~]$ module purge
>      >> [oe at puhti-login1 ~]$ module use -a /projappl/nlpl/software/20/etc
>      >> [oe at puhti-login1 ~]$ module load GCCcore/8.3.0 binutils/2.32
>      >> [oe at puhti-login1 ~]$ module list
>      >>
>      >> Currently Loaded Modules:
>      >>    1) GCCcore/8.3.0   2) binutils/2.32
>      >>
>      >>
>      >>
>      >> [oe at puhti-login1 ~]$ cat conftest.c
>      >> int main (void) {
>      >>    ;
>      >>    return 0;
>      >> }
>      >> [oe at puhti-login1 ~]$ gcc conftest.c
>      >> /projappl/nlpl/software/20/packages/binutils/2.32/bin/ld.gold: fatal error: a.out: Unknown error 524
>      >> collect2: error: ld returned 1 exit status
>      >> [oe at puhti-login1 ~]$ strace -f gcc conftest.c 2>&1 | grep fallocate
>      >> [pid 110197] fallocate(21, 0, 0, 7840)  = -1 ENOTSUPP (Unknown error 524)
>      >>
>      >> our current hypothesis is that the standard way of EasyBuild
>      >> bootstrapping gcc and binutils on Puhti ends up with a configuration
>      >> that is incompatible with creating binaries on the lustre filesystem.
>      >> when moving the above file to '/tmp/', say, i can compile it without
>      >> errors.
>      >>
>      >> i realize that we are off the beaten track here and well outside of
>      >> what i would expect as regular support from your end.  but our hope is
>      >> that you might see the abstract appeal in fully parallel software
>      >> installations across different systems, maybe even more so where this
>      >> is managed within our researcher community, i.e. has the potential to
>      >> shift some of the maintenance and support burden for
>      >> discipline-specific software towards us (semi-expert) users :-).
>      >>
>      >> does the above error from ld.gold ring a bell for someone, by any chance?
>      >>
>      >> i see that the system-wide gcc 8.3.0 module on Puhti does not include
>      >> binutils and was built using Spack; in fact, it appears there is no
>      >> separate binutils module beyond the stock RHEL 7 binaries (binutils
>      >> 2.27), is that correct?  we could of course try dropping our custom
>      >> binutils from the EasyBuild dependency tree, but that would (a) reduce
>      >> the degree of 'full' parallelism across systems and (b) require us to
>      >> modify core EasyBuild recipes.  hence, if possible we would much
>      >> rather understand the underlying nature of the above problem and
>      >> resolve it.  i imagine this may lead to a refinement of the EasyBuild
>      >> recipe for binutils, which we could then submit upstream.
>      >>
>      >> any comments on the above or suggestions for how to debug further will
>      >> be warmly appreciated.
>      >>
>      >> with thanks in advance, oe
>      >>
>      >>
> 
> 
>     -- 
>     Andrey
>     Language Technology Group (LTG)
>     University of Oslo
> 


-- 
Andrey
Language Technology Group (LTG)
University of Oslo

