[NLPL Infrastructure] Fwd: [CSC #479570] lustre-related error in custom installation of binutils

Stephan Oepen oe at ifi.uio.no
Wed May 5 14:59:17 UTC 2021


a very helpful reply from CSC; see below.  did you get this from their
RT queue already, andrey?

oe


---------- Forwarded message ---------
From: Henrik Nortamo via RT <research-support at csc.fi>
Date: Wed, May 5, 2021 at 10:02 AM
Subject: [CSC #479570] lustre-related error in custom installation of binutils
To: <oe at ifi.uio.no>


Hi,

Binutils is not provided as a module on Puhti, but the version that
gcc 8.3.0 uses can be found under
/appl/spack/install-tree/gcc-4.8.5/binutils-2.31.1-xbslr3/lib/.

That particular error comes from our version of lustre (2.12.6) not
supporting the "fallocate" system call (it has been implemented in
lustre 2.14.0, https://jira.whamcloud.com/browse/LU-3606). The
returned error code might also not be correct (see
https://jira.whamcloud.com/browse/LU-14301), which means that the
fallback paths some programs use when fallocate fails do not work
properly.
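
As a rough illustration of the fallback problem, here is a minimal
Python sketch (not gold's actual code) of the usual pattern: try to
preallocate, and fall back to ftruncate only for a fixed set of
"not supported" error codes. The names reserve and tolerate_enotsupp
are hypothetical; 524 is the value shown in the strace output of the
original report. A program whose accepted set lacks 524 treats the
failure as fatal instead of falling back.

```python
import errno
import os
import tempfile

# Kernel-internal ENOTSUPP as seen in the strace output; Python's errno
# module has no name for it, so it is spelled out here.
ENOTSUPP = 524

# Error codes that conventionally mean "preallocation unavailable here".
FALLBACK_ERRNOS = {errno.EINVAL, errno.ENOSYS, errno.EOPNOTSUPP}


def reserve(fd, length, tolerate_enotsupp=True):
    """Reserve `length` bytes for `fd`; return the method that was used."""
    if length <= 0:
        return "noop"
    accepted = set(FALLBACK_ERRNOS)
    if tolerate_enotsupp:
        accepted.add(ENOTSUPP)
    if hasattr(os, "posix_fallocate"):
        try:
            os.posix_fallocate(fd, 0, length)
            return "fallocate"
        except OSError as exc:
            # Any error outside the accepted set is treated as fatal.
            if exc.errno not in accepted:
                raise
    # Fallback: extend the file without preallocating blocks.
    os.ftruncate(fd, length)
    return "ftruncate"


if __name__ == "__main__":
    with tempfile.TemporaryFile() as handle:
        print(reserve(handle.fileno(), 7840))
```

With tolerate_enotsupp=False, a filesystem that reports 524 would
make this sketch raise rather than fall back, which mirrors the
failure mode described above.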

So I'm guessing EasyBuild builds binutils on a non-lustre filesystem
(e.g. /tmp, /local_scratch, $TMPDIR or similar), checks that fallocate
is supported there, and then fails when you try to compile on lustre.
I am not fully certain, but you might also be able to disable
fallocate support manually when building binutils (this could require
patching the configure scripts).
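
To compare filesystems directly, a small probe along the following
lines could be run against each directory of interest. This is only a
sketch; the function name fallocate_status is hypothetical. Note that
glibc emulates posix_fallocate when the kernel reports EOPNOTSUPP, so
a result of None means the call succeeded, not necessarily that the
filesystem supports fallocate natively.

```python
import os
import tempfile


def fallocate_status(directory, length=4096):
    """Return None if posix_fallocate succeeds in `directory`,
    otherwise the errno it failed with (e.g. 524 on affected lustre)."""
    with tempfile.TemporaryFile(dir=directory) as handle:
        try:
            os.posix_fallocate(handle.fileno(), 0, length)
        except OSError as exc:
            return exc.errno
    return None


if __name__ == "__main__":
    # Replace or extend this with a directory on the lustre filesystem.
    directory = tempfile.gettempdir()
    print(directory, fallocate_status(directory))
```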

A temporary workaround is to add "--no-posix-fallocate" to LDFLAGS
when compiling.
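
When linking through the gcc driver rather than calling the linker
directly, linker options are normally passed with the "-Wl," prefix.
The sketch below only assembles such a command line; the helper name
link_command is hypothetical, and it has not been verified here that
the option resolves the error on lustre. Linkers other than gold may
reject the option.

```python
import shlex

# gold option suggested above, wrapped so the gcc driver forwards it
# to the linker.
GOLD_WORKAROUND = "-Wl,--no-posix-fallocate"


def link_command(source, output="a.out", compiler="gcc", workaround=True):
    """Build the argument list for compiling and linking `source`."""
    command = [compiler, source, "-o", output]
    if workaround:
        command.append(GOLD_WORKAROUND)
    return command


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected.
    print(shlex.join(link_command("conftest.c")))
```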


Br.
Henrik Nortamo
CSC


On Tue May 04 23:19:05 2021, oe at ifi.uio.no wrote:
> dear colleagues:
>
> in the context of the NLPL use case in EOSC-Nordic, we are preparing
> an updated version of what we call the NLPL virtual laboratory,
> essentially a community-maintained collection of core software and
> data resources for natural language processing.  on Puhti, our virtual
> laboratory resides in '/projappl/nlpl/' and is collectively maintained
> by EOSC-Nordic project members at helsinki and oslo universities.
>
> for perfect parallelism of the installed software, we have now fully
> automated the process of compiling and installing a collection of
> dozens of packages, using EasyBuild.  in doing so, we use system-wide
> modules where they are available (in the exact same versions and
> configurations) and let EasyBuild fall back on building dependencies
> as needed.  on Puhti, that means we end up compiling, among other
> things, our own versions of gcc and GNU binutils.
>
> it now appears that the version of binutils created by the stock
> EasyBuild recipe on Puhti ends up incompatible with the lustre
> filesystem below '/projappl/'.  we have isolated the problem outside
> the EasyBuild environment, and it appears to boil down to ld.gold
> failing in fallocate(2) with the following error:
>
> [oe@puhti-login1 ~]$ module purge
> [oe@puhti-login1 ~]$ module use -a /projappl/nlpl/software/20/etc
> [oe@puhti-login1 ~]$ module load GCCcore/8.3.0 binutils/2.32
> [oe@puhti-login1 ~]$ module list
>
> Currently Loaded Modules:
>   1) GCCcore/8.3.0   2) binutils/2.32
>
>
>
> [oe@puhti-login1 ~]$ cat conftest.c
> int main (void) {
>   ;
>   return 0;
> }
> [oe@puhti-login1 ~]$ gcc conftest.c
> /projappl/nlpl/software/20/packages/binutils/2.32/bin/ld.gold: fatal error: a.out: Unknown error 524
> collect2: error: ld returned 1 exit status
> [oe@puhti-login1 ~]$ strace -f gcc conftest.c 2>&1 | grep fallocate
> [pid 110197] fallocate(21, 0, 0, 7840)  = -1 ENOTSUPP (Unknown error 524)
>
> our current hypothesis is that the standard way of EasyBuild
> bootstrapping gcc and binutils on Puhti ends up with a configuration
> that is incompatible with creating binaries on the lustre filesystem.
> when moving the above file to '/tmp/', say, i can compile it without
> errors.
>
> i realize that we are off the beaten track here and well outside of
> what i would expect as regular support from your end.  but our hope is
> that you might see the abstract appeal in fully parallel software
> installations across different systems, maybe even more so where this
> is managed within our researcher community, i.e. has the potential to
> shift some of the maintenance and support burden for
> discipline-specific software towards us (semi-expert) users :-).
>
> does the above error from ld.gold ring a bell for someone, by any chance?
>
> i see that the system-wide gcc 8.3.0 module on Puhti does not include
> binutils and was built using Spack; in fact, it appears there is no
> separate binutils module beyond the stock RHEL 7 binaries (binutils
> 2.27), is that correct?  we could of course try dropping our custom
> binutils from the EasyBuild dependency tree, but that would (a) reduce
> the degree of 'full' parallelism across systems and (b) require us to
> modify core EasyBuild recipes.  hence, if possible we would much
> rather understand the underlying nature of the above problem and
> resolve it.  i imagine this may lead to a refinement of the EasyBuild
> recipe for binutils, which we could then submit upstream.
>
> any comments on the above or suggestions for how to debug further will
> be warmly appreciated.
>
> with thanks in advance, oe
>
>

