[NLPL Infrastructure] Fwd: [CSC #479570] lustre-related error in custom installation of binutils
Andrey Kutuzov
andreku at ifi.uio.no
Wed May 5 15:07:30 UTC 2021
No, I have not received this.
So, we probably should try to add "--no-posix-fallocate" to the binutils
easyconfig, right?
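
To double-check the diagnosis on our side first, something like the sketch
below might help. It is only an illustration, assuming Python 3 on a Linux
login node; the function name probe_fallocate and the 4096-byte size are
made up here and are not taken from CSC's reply. It tries to preallocate a
scratch file in each given directory and reports the errno on failure. If
CSC's explanation is right, one would expect a failure with error 524 below
/projappl and success below /tmp.

```python
import os
import sys
import tempfile


def probe_fallocate(directory, size=4096):
    """Try to preallocate `size` bytes in a scratch file under `directory`.

    Returns (True, None) on success or (False, errno) on failure.
    The name and the default size are hypothetical, for illustration only.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Goes through the C library's posix_fallocate(), which asks the
        # filesystem to reserve the requested byte range.
        os.posix_fallocate(fd, 0, size)
        return True, None
    except OSError as exc:
        return False, exc.errno
    finally:
        # Always clean up the scratch file, whatever the outcome.
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # Probe the directories given on the command line, or the default
    # temporary directory if none are given.
    for directory in sys.argv[1:] or [tempfile.gettempdir()]:
        ok, err = probe_fallocate(directory)
        if ok:
            print(f"{directory}: fallocate ok")
        else:
            print(f"{directory}: fallocate failed, errno {err} ({os.strerror(err)})")
```

It could be run as, e.g., "python3 probe_fallocate.py /tmp /projappl/nlpl"
(the script file name is arbitrary).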
On 05.05.2021 16:59, Stephan Oepen wrote:
> a very helpful reply from CSC; see below. did you get this from their
> RT queue already, andrey?
>
> oe
>
>
> ---------- Forwarded message ---------
> From: Henrik Nortamo via RT <research-support at csc.fi>
> Date: Wed, May 5, 2021 at 10:02 AM
> Subject: [CSC #479570] lustre-related error in custom installation of binutils
> To: <oe at ifi.uio.no>
>
>
> Hi,
>
> Binutils is not provided as a module on Puhti, but the versions which
> gcc 8.3.0 uses can be found under
> /appl/spack/install-tree/gcc-4.8.5/binutils-2.31.1-xbslr3/lib/.
>
> That particular error comes from our version of lustre ( 2.12.6 ) not
> supporting the "fallocate" systemcall (it has been implemented in
> lustre 2.14.0, https://jira.whamcloud.com/browse/LU-3606). The
> returned error code might also not be correct (see
> https://jira.whamcloud.com/browse/LU-14301), which means that some
> programs' fallbacks from fallocate do not work properly.
>
> So I'm guessing EasyBuild builds binutils on a non-Lustre filesystem
> (e.g. /tmp, /local_scratch, $TMPDIR or similar), checks that fallocate
> is supported there, and then fails when you try to compile on Lustre.
> I am not fully certain, but you might also be able to disable fallocate
> support manually when building binutils (could require patching the
> configure scripts).
>
> A temporary workaround is to add "--no-posix-fallocate" to LDFLAGS
> when compiling.
>
>
> Br.
> Henrik Nortamo
> CSC
>
>
> On Tue May 04 23:19:05 2021, oe at ifi.uio.no wrote:
>> dear colleagues:
>>
>> in the context of the NLPL use case in EOSC-Nordic, we are preparing
>> an updated version of what we call the NLPL virtual laboratory,
>> essentially a community-maintained collection of core software and
>> data resources for natural language processing. on Puhti, our virtual
>> laboratory resides in '/projappl/nlpl/' and is collectively maintained
>> by EOSC-Nordic project members at helsinki and oslo universities.
>>
>> for perfect parallelism of the installed software, we have now fully
>> automated the process of compiling and installing a collection of
>> dozens of packages, using EasyBuild. in doing so, we use system-wide
>> modules where they are available (in the exact same versions and
>> configurations) and let EasyBuild fall back on building dependencies
>> as needed. on Puhti, that means we end up compiling, among other
>> things, our own versions of gcc and GNU binutils.
>>
>> it now appears that the version of binutils created by the stock
>> EasyBuild recipe on Puhti ends up incompatible with the lustre
>> filesystem below '/projappl/'. we have isolated the problem outside
>> the EasyBuild environment, and it appears to boil down to ld.gold
>> failing in fallocate(2) with
>>
>> [oe@puhti-login1 ~]$ module purge
>> [oe@puhti-login1 ~]$ module use -a /projappl/nlpl/software/20/etc
>> [oe@puhti-login1 ~]$ module load GCCcore/8.3.0 binutils/2.32
>> [oe@puhti-login1 ~]$ module list
>>
>> Currently Loaded Modules:
>> 1) GCCcore/8.3.0 2) binutils/2.32
>>
>>
>>
>> [oe@puhti-login1 ~]$ cat conftest.c
>> int main (void) {
>> ;
>> return 0;
>> }
>> [oe@puhti-login1 ~]$ gcc conftest.c
>> /projappl/nlpl/software/20/packages/binutils/2.32/bin/ld.gold: fatal
>> error: a.out: Unknown error 524
>> collect2: error: ld returned 1 exit status
>> [oe@puhti-login1 ~]$ strace -f gcc conftest.c 2>&1 | grep fallocate
>> [pid 110197] fallocate(21, 0, 0, 7840) = -1 ENOTSUPP (Unknown error 524)
>>
>> our current hypothesis is that the standard way of EasyBuild
>> bootstrapping gcc and binutils on Puhti ends up with a configuration
>> that is incompatible with creating binaries on the lustre filesystem.
>> when moving the above file to '/tmp/', say, i can compile it without
>> errors.
>>
>> i realize that we are off the beaten track here and well outside of
>> what i would expect as regular support from your end. but our hope is
>> that you might see the abstract appeal in fully parallel software
>> installations across different systems, maybe even more so where this
>> is managed within our researcher community, i.e. has the potential to
>> shift some of the maintenance and support burden for
>> discipline-specific software towards us (semi-expert) users :-).
>>
>> does the above error from ld.gold ring a bell for someone, by any chance?
>>
>> i see that the system-wide gcc 8.3.0 module on Puhti does not include
>> binutils and was built using Spack; in fact, it appears there is no
>> separate binutils module beyond the stock RHEL 7 binaries (binutils
>> 2.27), is that correct? we could of course try dropping our custom
>> binutils from the EasyBuild dependency tree, but that would (a) reduce
>> the degree of 'full' parallelism across systems and (b) require us to
>> modify core EasyBuild recipes. hence, if possible we would much
>> rather understand the underlying nature of the above problem and
>> resolve it. i imagine this may lead to a refinement of the EasyBuild
>> recipe for binutils, which we could then submit upstream.
>>
>> any comments on the above or suggestions for how to debug further will
>> be warmly appreciated.
>>
>> with thanks in advance, oe
>>
>>
--
Andrey
Language Technology Group (LTG)
University of Oslo