How to check for SIGABRT?

Discussion:

Simon Sobisch

2017-12-08 20:36:54 UTC

We use the autoconf generated testsuite script in GnuCOBOL to test the
compiler and runtime - and it works very well for "normal" tests.

There are some tests where the compiler should abort, and it does. When
it does so "correctly" by raising SIGABRT, we can check for return code
134, but we also get an additional stderr message similar to
"/full/path/to/testsuite.de/testcasenumber/run aborted on line 40". I
also don't know whether SIGABRT results in return code 134 on all
"exotic" systems.

The following options come to mind:

* use `trap` in AT_CHECK to catch the error if it is the expected result
--> should fix the additional stderr, we still would have to check for
return code 134

* use some builtin expectation similar to XFAIL(true) - but I didn't
find anything like this in the docs

* when running under the testsuite, don't raise SIGABRT but do something
like `exit 96` instead (we already have one case where we detect the
testsuite via an environment variable set in atlocal and return 77 to
skip the test if the compiler cannot use an external tool)

Do you have any experience/thoughts about this?
Is there any "best practice" (ideally as official documentation) for
checking signals in the autoconf generated testsuite script?

Thank you in advance,
Simon

Eric Blake

2017-12-08 22:25:02 UTC

POSIX says that SIGABRT is 6 on XSI systems, but you can be compliant
with POSIX without being an XSI system (but non-XSI systems tend to not
run autoconf). However, your bigger problem is that on ksh, SIGABRT
shows up in $? as 262 (it specifically treats signals as 256+num instead
of 128+num, to make signals unambiguous with normal exit status).

Post by Simon Sobisch
* use `trap` in AT_CHECK to catch the error if it is the expected result
--> should fix the additional stderr, we still would have to check for
return code 134

Your trap could then exit (normally) with a known exit status, rather
than reusing the inherited status that may or may not be 134. Or even
write to a witness file, ignore the value of $? altogether, and do a
second AT_CHECK that the witness file was touched.

Post by Simon Sobisch
Do you have any experience/thoughts about this?
Is there any "best practice" (ideally as official documentation) for
checking signals in the autoconf generated testsuite script?

Expecting death by a signal is unusual, so I don't know of anyone else
doing it. But if it is a common enough paradigm, I agree that making it
easier in autotest to check that a process died due to a particular
signal may be a worthwhile enhancement.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org

Russ Allbery

2017-12-08 22:51:48 UTC

Post by Simon Sobisch
We use the autoconf generated testsuite script in GnuCOBOL to test the
compiler and runtime - and it works very well for "normal" tests.
There are some tests where the compiler should abort and it does, but
when it does so "correctly" by raising SIGABRT we can check for return
code 134 but get an additional stderr message similar to
"/full/path/to/testsuite.de/testcasenumber/run aborted on line 40" (and
I don't know if SIGABRT will result in return code 134 on all "exotic"
systems).
* use `trap` in AT_CHECK to catch the error if it is the expected result
--> should fix the additional stderr, we still would have to check for
return code 134
* use some builtin expectation similar to XFAIL(true) - but I didn't
find anything like this in the docs
* when running under the testsuite, don't raise SIGABRT but do something
like `exit 96` instead (we already have one case where we detect the
testsuite via an environment variable set in atlocal and return 77 to
skip the test if the compiler cannot use an external tool)
Do you have any experience/thoughts about this?

Personally, I'd make the actual test a tiny C program that execs argv,
waits for it to exit, and then inspects the resulting exit status to see
if it died with SIGABRT.

--
Russ Allbery (***@eyrie.org) <http://www.eyrie.org/~eagle/>

Bob Friesenhahn

2017-12-08 22:54:34 UTC

Different shells and different OSs will return a different code.
There is no standard.

One thing you can do is to interject a tiny shim program which does
fork/exec to start the program being tested and then use wait(2) or
waitpid(2) to wait for the program to quit. You can also just use
system(3). You can then use macros such as WTERMSIG() to see what
signal caused the program to exit (if any). Your shim program can
return codes that you expect.

The Linux manual page for waitid(2) documents the various macros and
how to use them.

Some systems interject their own core file handling and may put core
files in a special directory, use unexpected names for the core files,
or say "Your system has a problem". Producing core files in configure
scripts may be confusing for the user.

Bob

--
Bob Friesenhahn
***@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/