Discussion:
Compiling for newer Intel CPUs with an older Intel build system?
Sean Byland
2016-03-04 16:11:12 UTC
Hello,

I’m trying to help users get autotools-based projects to compile in our somewhat unique environment. It’s fairly common for users to want to compile on an Intel Ivy Bridge system (a login node) with Broadwell-specific performance optimizations (Broadwell supports a superset of Ivy Bridge’s CPU instructions) to run elsewhere (a compute node). With default options configure won’t handle this scenario because the build system can’t execute the Broadwell-specific x86 instructions. With varying degrees of success, configure can be made to work by:

1. Setting --build to a canonical name slightly different from the real build system’s, so configure concludes it should cross-compile

2. Generating a config.cache on the Broadwell compute node, which shares most of its file system with the Ivy Bridge system and can successfully execute everything, then pointing configure on the Ivy Bridge system at the cache generated on the Broadwell CPU.

3. Setting configure up to use a test “launcher,” so configure’s test programs are executed on the Broadwell system.


I like option one because it follows a use case the autotools were designed for, but I see regular failures because (for example) a packager will use AC_TRY_RUN without defining a sensible “[action-if-cross-compiling]”. I like that option three lets configure’s main code path function as intended, but I strongly dislike that it requires a Broadwell CPU, which won’t always be available for every build, and it would probably require hacking and/or having a configure user perform packager actions. Option two gets test results from the desired execution environment, lets configure run quickly, and requires only minimal use of the Broadwell system. Is using the cache in this manner generally a bad idea?
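Concretely, option one amounts to something like the following invocation (a sketch; the triplet names are illustrative, and any --build value that differs from the real build system puts configure into cross-compilation mode):

```shell
./configure --build=x86_64-build-linux-gnu --host=x86_64-pc-linux-gnu \
            CFLAGS='-O2 -march=broadwell'
```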


I’d also appreciate any general feedback; in the past my lack of autotools knowledge has led me to “fight” these tools, so I’d like to avoid deviating too far from what they were designed for.


Sincerely,

Sean


P.S. Thanks to A. Duret-Lutz for making the very useful tutorial:

https://www.lrde.epita.fr/~adl/autotools.html
Nick Bowler
2016-03-04 18:24:28 UTC
Post by Sean Byland
I’m trying to help users get autotools-based projects to compile in our
somewhat unique environment. It’s fairly common for users to want to
compile on an Intel Ivy Bridge system (a login node) with
Broadwell-specific performance optimizations (Broadwell supports a
superset of Ivy Bridge’s CPU instructions) to run elsewhere (a compute
node). With default options configure won’t handle this scenario
because the build system can’t execute the Broadwell-specific x86
instructions.
1. Setting --build to a canonical name slightly different from the real
build system’s, so configure concludes it should cross-compile
Just so you know, you can directly force cross compilation mode by
setting cross_compiling=yes, as in:

./configure cross_compiling=yes CFLAGS="..."

which might be more straightforward than faking out --build.
Post by Sean Byland
2. Generating a config.cache on the Broadwell compute node, which shares
most of its file system with the Ivy Bridge system and can successfully
execute everything, then pointing configure on the Ivy Bridge system at
the cache generated on the Broadwell CPU.
This will work fine if the systems are similar enough, although probably
simpler to just directly share the build directories rather than mucking
with config.cache.
Post by Sean Byland
3. Setting configure up to use a test “launcher,” so configure’s test
programs are executed on the Broadwell system.
Yuck.
Post by Sean Byland
I like option one because it follows a use case the autotools were
designed for, but I see regular failures because (for example) a
packager will use AC_TRY_RUN without defining a sensible
“[action-if-cross-compiling]”.
Right. Unfortunately many packages do not properly support cross
compiling.
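The packager-side fix is to supply that fourth argument. A minimal sketch using AC_RUN_IFELSE (the modern spelling of AC_TRY_RUN); the cache variable name and test body here are made up for illustration:

```
AC_CACHE_CHECK([whether feature xyz works], [my_cv_xyz_works],
  [AC_RUN_IFELSE(
     [AC_LANG_PROGRAM([], [return 0;])],
     [my_cv_xyz_works=yes],
     [my_cv_xyz_works=no],
     [my_cv_xyz_works=no])])dnl action-if-cross-compiling: conservative guess
```

When cross-compiling, configure takes the fourth branch instead of trying to execute a binary the build CPU may not be able to run, and the cache variable gives users a knob to override the guess.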

Do you expect the actual configure test results to matter? Because
option 4 is to leave the CPU-specific optimizations off when you run
configure, then turn them on when you build. For example:

./configure
[...]
make CFLAGS="-O2 -march=broadwell"

Unless some hand-written assembly is being selected based on configure
tests, I expect this would work fine in most cases (but some packages
don't handle user-set CFLAGS properly).

For configure scripts that cache test results, you can also just force
the results of problematic tests by setting the cache variable, as in:

./configure foo_cv_xyz=bar

But as with cross-compilation, not all packages support this properly
(and usually the cache variables are not documented).
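Cache variable names follow autoconf's <prefix>_cv_<name> convention, so one way to discover them is to grep the generated configure script. A sketch against a small stand-in file (in practice, point the grep at the real ./configure):

```shell
# Autoconf cache variables all contain the "_cv_" infix; extract them.
# A stand-in file is used here so the example is self-contained.
printf 'ac_cv_header_stdio_h=yes\nfoo_cv_xyz=bar\n' > configure.sample
grep -o '[A-Za-z0-9]*_cv_[A-Za-z0-9_]*' configure.sample | sort -u
```

which prints the two variable names, one per line.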

Cheers,
Nick
Nick Bowler
2016-03-04 19:25:19 UTC
[...]
Post by Nick Bowler
2. Generating a config.cache on the Broadwell compute node, which
shares most of its file system with the Ivy Bridge system and can
successfully execute everything, then pointing configure on the
Ivy Bridge system at the cache generated on the Broadwell CPU.
This will work fine if the systems are similar enough, although
probably simpler to just directly share the build directories rather
than mucking with config.cache.
I agree that this is a good option. The only reason I liked the cache
idea is that I thought it would be neat to generate a substantial cache
usable across multiple autotools projects, so compiles could be
performed when users don’t have access to the newer runtime system.
This sounds like a nice idea in concept, but unfortunately the
config.cache files are not meant to be shared between different
packages. This has been tried before, and it inevitably leads
to disaster. Most obvious is the possibility of namespace
collision (two packages could use the same variable name for
totally different things) but more subtle issues can come up
too.

Setting specific cache variables to handle specific cases is
another story (and often necessary for cross builds).

Cheers,
Nick
Bob Friesenhahn
2016-03-04 21:57:59 UTC
Post by Nick Bowler
This sounds like a nice idea in concept, but unfortunately the
config.cache files are not meant to be shared between different
packages. This has been tried before, and it inevitably leads
to disaster. Most obvious is the possibility of namespace
collision (two packages could use the same variable name for
totally different things) but more subtle issues can come up
too.
Cache values are often interdependent. The success of one may depend on
another, or on a particular shell variable value at the time the
associated test was run. The linker and preprocessor search paths are a
good example of such a dependency.

Bob
--
Bob Friesenhahn
***@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Mike Frysinger
2016-03-04 18:41:16 UTC
Post by Sean Byland
1. Setting --build to a canonical name slightly different from the real build system’s, so configure concludes it should cross-compile
2. Generating a config.cache on the Broadwell compute node, which shares most of its file system with the Ivy Bridge system and can successfully execute everything, then pointing configure on the Ivy Bridge system at the cache generated on the Broadwell CPU.
3. Setting configure up to use a test “launcher,” so configure’s test programs are executed on the Broadwell system.
I like option one because it follows a use case the autotools were designed for, but I see regular failures because (for example) a packager will use AC_TRY_RUN without defining a sensible “[action-if-cross-compiling]”. I like that option three lets configure’s main code path function as intended, but I strongly dislike that it requires a Broadwell CPU, which won’t always be available for every build, and it would probably require hacking and/or having a configure user perform packager actions. Option two gets test results from the desired execution environment, lets configure run quickly, and requires only minimal use of the Broadwell system. Is using the cache in this manner generally a bad idea?
I’d also appreciate any general feedback; in the past my lack of autotools knowledge has led me to “fight” these tools, so I’d like to avoid deviating too far from what they were designed for.
seems like the only thing you need to do is properly set CFLAGS/CXXFLAGS.
if you want a build that'll work on all x86 systems, then use something
like:
./configure CFLAGS='-O2 -march=x86-64 ...' CXXFLAGS='-O2 -march=x86-64 ...'

no need to mess with --build or --host, or config.cache. i'm assuming
the configure script doesn't attempt to detect CPU extensions via some
RUN tests ... if it does, then the script should be fixed to probe via
compile-time defines and/or add a configure flag to control them (like
--disable-mmx).
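A compile-only probe of the kind described above never executes target code, so it works even when the build CPU can't run the target's instructions. A sketch of the idea (assumes a cc that understands -mavx2, as gcc and clang do; it relies on the compiler predefining __AVX2__ when that ISA is enabled):

```shell
# Ask the compiler, not the CPU: compilation fails unless the target
# flags enable AVX2. No binary is ever executed.
cat > avx2-probe.c <<'EOF'
#ifndef __AVX2__
#error "AVX2 not enabled by the target flags"
#endif
int main(void) { return 0; }
EOF
if cc -mavx2 -c avx2-probe.c -o avx2-probe.o 2>/dev/null; then
  echo "compiler can target AVX2"
else
  echo "compiler cannot target AVX2"
fi
```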
-mike
Mike Frysinger
2016-03-04 20:50:45 UTC
Post by Sean Byland
Thanks. Targeting the least-common-denominator ISA to get portable
code works well for many things, but in this case I care more about
performance than portability.
that's not what you said. you said you wanted to build on an older cpu
and execute on a newer one. if the code must also run on the build
machine, you really can't have it both ways: you must pick a lowest
common denominator ISA (via the -march flag). there is the -mtune flag
to let you select insn scheduling and such, but gcc won't generate
incompatible code.
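The two flags can be combined: target the oldest CPU the binaries must run on with -march, and schedule for the newer one with -mtune. A sketch (flag spellings per gcc):

```shell
./configure CFLAGS='-O2 -march=ivybridge -mtune=broadwell' \
            CXXFLAGS='-O2 -march=ivybridge -mtune=broadwell'
```

The result still runs on the Ivy Bridge build node while instruction scheduling favors the Broadwell compute nodes.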

if you want to know more about what gcc offers with insn/isa selection,
you should ask on the gcc mailing lists:
https://gcc.gnu.org/ml/gcc-help/

this question isn't really relevant to autoconf/autotools
-mike