m4_pattern_forbid
Matěj Týč
2015-03-12 22:36:04 UTC
Hello,
I am doing some m4sugar programming and I found the m4_pattern_forbid
insufficient.
As we all know, there are many kinds of regular expression in different
programs and even on different OSs (grep, vim, Python, linux sed, osx
sed, ...). So my question is - where should I look to learn what kind of
regular expression I can specify to that macro?

Interestingly, I took a look at autoconf/m4sugar/m4sugar.m4:2005 to
learn the definition and I was surprised by what I found there:

# m4_pattern_forbid(ERE, [WHY])
# -----------------------------
# Declare that no token matching the forbidden extended regular
# expression ERE should be seen in the output unless...
m4_define([m4_pattern_forbid], [])

I was unable to find the true macro's definition anywhere, so here is
the second question: How (and where) is m4_pattern_forbid defined?
Eric Blake
2015-03-12 22:51:06 UTC
Post by Matěj Týč
Hello,
I am doing some m4sugar programming and I found the m4_pattern_forbid
insufficient.
As we all know, there are many kinds of regular expression in different
programs and even on different OSs (grep, vim, Python, linux sed, osx
sed, ...). So my question is - where should I look to learn what kind of
regular expression I can specify to that macro?
It is an extended regular expression according to what perl understands
(which is similar to what 'grep -E' understands).
Post by Matěj Týč
Interestingly, I took a look at the autoconf/m4sugar/m4sugar.m4:2005 to
# m4_pattern_forbid(ERE, [WHY])
# -----------------------------
# Declare that no token matching the forbidden extended regular
# expression ERE should be seen in the output unless...
m4_define([m4_pattern_forbid], [])
I was unable to find the true macro's definition anywhere, so here is
the second question: How (and where) is m4_pattern_forbid defined?
That's exactly where it is defined in m4. What you are really looking
for is the code that traces all no-op uses of that definition and then
greps the output for those forbidden patterns. For that, look at
bin/autom4te.in, for 'sub warn_forbidden'. It is one of several m4sugar
constructs that is done by the autom4te perl wrapper that invokes m4 and
post-processes the output, rather than directly by m4.

Would you like to propose a documentation patch to make it clear that
the flavor of regex in use by this macro is what perl understands?
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
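
The mechanism described above (a macro that expands to nothing, with the
real check done by a wrapper that post-processes the m4 output) can be
sketched roughly as follows. This is a minimal illustration in Python, not
the Perl code from bin/autom4te.in: the function name find_forbidden, its
parameters, and the choice to split words on non-word characters are all
assumptions made for the example, and Python's re module stands in for
Perl's regex engine.

```python
import re


def find_forbidden(output, forbid_patterns, allow_patterns=()):
    """Hypothetical post-processing pass over already-expanded m4 output.

    Returns {word: first_line_number} for every word that matches at
    least one forbidden pattern and no allowed pattern.
    """
    forbid = [re.compile(p) for p in forbid_patterns]
    allow = [re.compile(p) for p in allow_patterns]
    hits = {}
    for lineno, line in enumerate(output.splitlines(), start=1):
        # Assumption: words are separated by runs of non-word characters.
        for word in re.split(r"\W+", line):
            if not word or word in hits:
                continue
            forbidden = any(r.search(word) for r in forbid)
            allowed = any(r.search(word) for r in allow)
            if forbidden and not allowed:
                hits[word] = lineno
    return hits


# The patterns would come from tracing m4_pattern_forbid/m4_pattern_allow
# calls; here they are simply passed in by hand.
output = "FOO1\nFOO2\nhidden FOO3\nhiddenFOO4\n"
hits = find_forbidden(output, [r"^FOO"], allow_patterns=[r"^FOO2$"])
for word, lineno in hits.items():
    print(f"-:{lineno}: error: possibly undefined macro: {word}")
```

The point of the sketch is only the division of labour: the macro call
itself does nothing, and the patterns it was given are applied afterwards
to the generated text.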
Matěj Týč
2015-03-12 23:33:36 UTC
Post by Eric Blake
...
Would you like to propose a documentation patch to make it clear that
the flavor of regex in use by this macro is what perl understands?
Thank you very much, autoconf is indeed a deeply fascinating project!
The documentation patch is on schedule.
Matěj Týč
2015-03-18 00:02:07 UTC
Post by Eric Blake
...
Would you like to propose a documentation patch to make it clear that
the flavor of regex in use by this macro is what perl understands?
I have done a bit of research on how m4_pattern_forbid works in order to
propose beneficial changes to the documentation, and it seems to me that
there is slightly more going on behind the scenes. I am more of a Python
than a Perl guy, though, so I was not able to figure it out from the
autom4te source code.

What is interesting: I wanted to point out that it should be more
appropriate to write m4_pattern_forbid([\bMACRO]) than ...([^MACRO]),
but I found out that this is not really true:

cat << EOF | autom4te -l m4sugar
m4_pattern_forbid([^FOO])
m4_divert_push(1)dnl
FOO1
FOO2
hidden FOO3
hiddenFOO4
EOF

yields:
stdin:2: warning: prefer named diversions
FOO1
FOO2
hidden FOO3
-:1: error: possibly undefined macro: FOO1
If this token and others are legitimate, please use
m4_pattern_allow.
See the Autoconf documentation.
-:2: error: possibly undefined macro: FOO2
-:3: error: possibly undefined macro: FOO3

However, ^FOO should match only FOO2: in the case of FOO1 there is
whitespace, and in the case of FOO3 a word, between the beginning of the
line and FOO. At least FOO4 is left alone.

So maybe involving \b is not needed under these circumstances?
Eric Blake
2015-03-18 00:06:37 UTC
Post by Matěj Týč
What is interesting: I wanted to point out that it should be more
appropriate to write m4_pattern_forbid([\bMACRO]) than ...([^MACRO]),
The perl code in autom4te splits all words of the input into one word
per line before running the regex.
Post by Matěj Týč
cat << EOF | autom4te -l m4sugar
m4_pattern_forbid([^FOO])
m4_divert_push(1)dnl
FOO1
FOO2
hidden FOO3
so all three instances of these FOO match the pattern to be rejected
(after being rewritten as:
FOO1
FOO2
hidden
FOO3
before testing the regex)
Post by Matěj Týč
hiddenFOO4
and this does not.
Post by Matěj Týč
So maybe involving \b is not needed under these circumstances?
The accept/reject patterns operate on each word as if on the line by
themselves, so no need to use \b.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
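
To make the point about ^ versus \b concrete, the following sketch
compares three ways of applying the pattern to the sample input from the
thread. It is written in Python for illustration only and is not the
autom4te implementation; the non-word-character split is an assumption
about how words are separated, and the leading space before FOO1 is an
assumption taken from the remark that whitespace precedes it.

```python
import re

# Sample input from the thread (leading space on the first line assumed).
lines = [" FOO1", "FOO2", "hidden FOO3", "hiddenFOO4"]

anchored = re.compile(r"^FOO")

# 1. Whole-line matching, as one might expect from grep-style use of ^.
whole_line = [ln for ln in lines if anchored.search(ln)]

# 2. Per-word matching: split each line into words first, then test each
#    word as if it stood on a line by itself.
per_word = [
    word
    for ln in lines
    for word in re.split(r"\W+", ln)
    if word and anchored.search(word)
]

# 3. Whole-line matching with an explicit word boundary instead of ^.
word_boundary = [m for ln in lines for m in re.findall(r"\bFOO\w*", ln)]

print(whole_line)     # ['FOO2']
print(per_word)       # ['FOO1', 'FOO2', 'FOO3']
print(word_boundary)  # ['FOO1', 'FOO2', 'FOO3']
```

Under these assumptions, splitting into words before matching makes ^FOO
select the same tokens that \bFOO would select on the unsplit lines, which
is why the explicit \b adds nothing here. hiddenFOO4 is rejected by both,
since FOO is neither at the start of the word nor at a word boundary.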