In a basic regular expression, a dollar-sign only means end-of-string if
it appears at the end of the pattern, or (at the choice of the
implementation) at the end of a \(...\) subexpression.
This affects the package converters/help2man that uses a regular
expression containing a dollar in a non-final position. This regular
expression had not been detected as an identity substitution even though
it is one.
This isolates the tests from the test infrastructure and allows the test
infrastructure to create arbitrary files for its own purpose without
affecting any of the tests.
Files like Makefile.am and configure.ac are usually not used during a
build, therefore there's no point in checking these for shell portability
issues.
A commonly occuring scenario is that a package patches the configure
script, but that the corresponding configure.in contains shell code that
is not portable. In cases like these, configure.in is typically not used
during the build, therefore there is no need to check it for portability.
This also applies to all other combinations where a file is patched and
the corresponding file.in contains unportable shell code.
Having quotes around the sed commands does not change their meaning.
These quotes must not lead to syntax errors when parsing the shell
command. This happened in mk/subst.mk r1.91 because the double quote was
accidentally escaped.
The escaping inside the backticks had been wrong. Because of this,
parentheses and semicolons were interpreted as shell syntax.
Switching to $(...) command substitution removes the need for quoting
some of the characters and makes the whole command simpler to understand.
Doing the escaping for the backticks command properly would have involved
lots of special cases.
The $(...) command substitution was used sparingly in pkgsrc up to now
because some older or broken shells do not support it. Since these
shells do not end up as the shell that runs the commands from Makefiles,
that's not a problem.
Right now, the tests in regress/tools are a mixture of testing the pkgsrc
infrastructure in mk/tools and the tools provided by the platforms.
These don't go well together, therefore the tests for the platform tools
will be migrated here.
ShellCheck complained about a parse error in the empty functions, which
agrees with the shell grammar given in IEEE 1003.1, both the 2004 Edition
and the 2018 Edition.
To work properly, the $(...) should have been $$(...).
In pkgsrc the command substitution is usually done via `backticks`, for
compatibility with /bin/sh from Solaris. To fix the shell parse errors,
the special characters are properly escaped inside the command
substitution.
Since SUBST_FILTER_CMD is a shell command, it may contain arbitrary
characters. The condition in mk/subst.mk that tested whether
SUBST_FILTER_CMD was the default filter command was evaluated at run
time. In such an evaluation, the variables (lhs and rhs) are fully
expanded before parsing the condition. This means that these variables
must not contain quotes or unquoted condition operators.
Exactly this situation came up in one of the regression tests. The
quoted "0-9" was copied verbatimly into the condition, including the
quotes. The resulting condition was:
"tr -d "0-9"" == "LC_ALL=C /usr/bin/sed "
This produced a syntax error because of the left-hand side. Adding a :Q
modifier would have helped for the left-hand side, but this would have
been necessary for the right-hand side as well. Since an empty SUBST_SED
is defined not to "contain only identity substitutions", the first
condition can simply be removed.
The whole condition in the shell program had not worked anyway since it
expanded to either "[ true ]" or to "[ false ]", and both of these
commands exited successfully.
There are several cases where patterns like s|man|${PKGMANDIR}| appear in
SUBST_SED. Up to now, these had been categorized as no-ops and required
extra code to make the package build when SUBST_NOOP_OK was set to "no".
This was against the original intention of SUBST_NOOP_OK, which was to
find outdated substitution patterns that do not occur in SUBST_FILES
anymore, most often because the packages have been updated since.
The identity substitutions do appear in the files, they just don't change
them. Typical cases are for PKGMANDIR, DEVOSSAUDIO, PREFIX, and these
variables may well be different in another pkgsrc setup. These patterns
are therefore excluded from the SUBST_NOOP_OK check.
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html
says in section 9.3.2 BRE Ordinary Characters that only very few
characters may be preceded with a backslash.
As a side effect, this change allows parentheses in the variable names
listed in SUBST_VARS (even if that will never happen in practice).
The reason that the regression test had not replaced VAR.[] before was
simply that this variable had not been listed in SUBST_VARS.
The documentation now starts with a high-level introduction instead of
listing only the details.
The name of the function test_file had to be changed since the old name
was not expressive enough. Same for the variable real_pkgsrcdir.
These tests demonstrate that it is not easy to exclude only one top-level
directory from being extracted, using the example of lang/gcc*, which has
a top-level directory contrib/ that contains shell programs with
non-portable code, but the same archive also contains libjava/contrib, and
that should still be extracted.
The severity now depends only on the setting of SUBST_NOOP_OK. Right now
this means that some former warnings will be reported as info only, but
that will change after switching the default of SUBST_NOOP_OK after
2020Q1. Then they will all be reported as warnings, followed by the final
error saying that the pattern has no effect.
This change makes it easier to detect inconsistencies and outdated
definitions, for example by setting the global SUBST_NOOP_OK=no and
redefining WARNING_MSG to actuall fail.
In the old test code, the input and output data for each test case were
in different files. This was too far apart to relate them. In addition,
all test cases were merged into a single big test, which made it hard to
tell the topics apart.
Before, variables containing dollar characters displayed so wrong that it
was hard to explain.
To fix the problem, I typed almost random characters into the code until
the output was exactly as expected. I still do not understand:
* why the list variables need 8 dollars to survive the @x@ loop,
* why the code only works if the dollars come from an external variable
instead of being written inline,
* why the backslash in the :C modifier needs to be doubled.
Anyway, the output of "bmake show-all-extract" now contains the shell
variable $${extract_file}, just as it should. The dollars are now doubled
in the output and thereby match the source code from the Makefile
exactly.
A real-life example of this pattern is EXTRACT_CMD_DEFAULT, which refers
to DOWNLOADED_DISTFILE, which is defined as "$${extract_file}".
Until mk/extract/extract.mk r1.40, that argument was discarded from the
output completely. With the surrounding quotes at least the quotes are
visible in the output of "bmake show-all-extract".
The default value of SUBST_MESSAGE is based on SUBST_FILES, and that
variable may use the :sh modifier to list files from WRKSRC, which may
not exist at load time.
Before, the first assertion failure quit immediately. This prevented
getting a complete picture of the situation that failed. Now the
assertions continue the test and fail at the very end.
In the case of pkglocaledir, the SUBST_FILES are generated by a shell
command. That command assumes that the WRKDIR already exists. Therefore
SUBST_FILES must be evaluated as late as possible.
See mk/configure/replace-localedir.mk; an example package that fails is
devel/gettext-tools.
Before, not only files containing an RCS Id were recorded in the
+BUILD_VERSION file but also files containing text that looked similar to
an RCS Id were recorded, even though these didn't contain any valuable
version information.
The effect was that before this change, pkgtools/pkglint was built over
and over again by the bulk builds since pbulk uses a different regular
expression for detecting modified files.
The regular expression for unexpanded RCS Ids is added to record files
that have never been checked in to CVS, just to have them recorded and to
distinguish them from the final committed version.
See https://mail-index.netbsd.org/tech-pkg/2020/01/11/msg022489.html.
Before, not only files containing an RCS Id were recorded in the
+BUILD_VERSION file but also files containing text that looked similar to
an RCS Id were recorded, even though these didn't contain any valuable
version information.
The effect was that before this change, pkgtools/pkglint was built over
and over again by the bulk builds since pbulk uses a different regular
expression for detecting modified files.
The regular expression for unexpanded RCS Ids is added to record files
that have never been checked in to CVS, just to have them recorded and to
distinguish them from the final committed version.
See https://mail-index.netbsd.org/tech-pkg/2020/01/11/msg022489.html.
In pkgtools/pkglint, there are several lines that look almost like RCS
Ids. Some parts of the pkgsrc infrastructure expand them and some others
don't. This needs to be fixed so that all parts of pkgsrc agree what is a
complete RCS Id and what isn't.
As long as that is not the case, pbulk unnecessarily builds pkglint over
and over again, even if nothing changed. There are probably other
unintended side effects as well that just haven't been discovered or
considered grave enough.
The regress/ directory does not contain pkgsrc packages, therefore it
should not be listed as a SUBDIR in the top-level Makefile.
This wrong impression could be caused because most of the regression
tests have a Makefile that looks like an actual package Makefile. But
this alone doesn't mean that these are packages. The only relevant file
for a regression test is the spec file. If that test uses a package
Makefile or not is an implementation detail of each test.
The CPPFLAGS, CFLAGS, CXXFLAGS and LDFLAGS differ between the build phase
and the install phase. It's only a minor difference but may still
influence packages that use these flags at install time, even though they
shouldn't.
For now just document that the flags differ.
The variable CHECK_PERMS_AUTOFIX has been existing since 2006 but is not
used in any package. This may be because it is not helpful in any way.
When a package sets this variables to yes, the permission errors are not
silently fixed, but the build still fails instead. This behavior is not
useful in any way and thus needs to be fixed.
See https://mail-index.netbsd.org/tech-pkg/2019/08/thread1.html#021828
Before, the filename "3270" was wrongly replaced with "${PYVERSSUFFIX}"
since the version number "3.7", when interpreted as a regular expression,
matched that filename.
Before, the tool arguments were written to the log as plain strings. Now
the arguments are properly quoted, which makes it possible to replay the
commands by copying them from the .work.log file.
This only affects tools that are shell builtins (echo, true, false), get
additional arguments (mkdir -p) or define a custom TOOLS_SCRIPT
(pkg-config, to set an environment variable; or autotools). Tools that
are symlinked to the real tool are not affected.
The calls to the compiler are already properly logged since cwrappers
takes care of that. This commit therefore makes the log entries for the
compilers and the other tools more similar.
When a package or the infrastructure defined a tool with custom
TOOLS_ARGS or TOOLS_SCRIPT containing special characters, these could
lead to unintuitive interactions at the time when that tool invocation
was logged in the tool wrapper log. Some of the logging output ended up
on stdout, while some of the normal output ended up in the log, and parts
of the quoted arguments were even evaluated as shell commands.
The logging of the wrapped tool commands is not perfect yet, but at least
it's much more predictable now.