There's an issue where sometimes for i686-linux and armhf-linux, only a few
package derivations are computed.
This commit tries to simplify the code, and adds some conditional logging for
the guix package, which might help reveal what's going on.
This might be generally useful, but I've been looking at it as it offers a way
to try and improve query performance when you want to select all the
derivations related to the packages for a revision.
The data looks like this (for a specified system and target):
┌───────┬───────┐
│ level │ count │
├───────┼───────┤
│ 15 │ 2 │
│ 14 │ 3 │
│ 13 │ 3 │
│ 12 │ 3 │
│ 11 │ 14 │
│ 10 │ 25 │
│ 9 │ 44 │
│ 8 │ 91 │
│ 7 │ 1084 │
│ 6 │ 311 │
│ 5 │ 432 │
│ 4 │ 515 │
│ 3 │ 548 │
│ 2 │ 2201 │
│ 1 │ 21162 │
│ 0 │ 22310 │
└───────┴───────┘
Level 0 reflects the number of packages. Level 1 is similar as you have all
the derivations for the package origins. The remaining levels contain less
packages since it's mostly just derivations involved in bootstrapping.
When using a recursive CTE to collect all the derivations, PostgreSQL assumes
that the each derivation has the same number of inputs, and this leads to a
large overestimation of the number of derivations per a revision. This in turn
can lead to PostgreSQL picking a slower way of running the query.
When it's known how many new derivations you should see at each level, it's
possible to inform PostgreSQL this by using LIMIT's at various points in the
query. This reassures the query planner that it's not going to be handling
lots of rows and helps it make better decisions about how to execute the
query.
Generating system test derivations are difficult, since you generally need to
do potentially expensive builds for the system you're generating the system
tests for. You might not want to disable grafts for instance because you might
be trying to test whatever the test is testing in the context of grafts being
enabled.
I'm looking at skipping the system tests on data.guix.gnu.org, because they're
not used and quite expensive to compute.
This should help with query performance, as the recursive queries using
derivation_inputs and derivation_outputs are particularly sensitive to the
n_distinct values for these tables.
And remove the chunking of derivation lint warnings.
The derivation linter computes the derivation for each packages supported
systems, but there are two problems with the approach. By doing this for each
package in turn, it forces inefficient uses of caches, since most of the
cached data is only relevant to a single system. More importantly though,
because the work of checking one package is dependent on it's supported
systems, it's unpredictable how much work will happen, and this will tend to
increase as more packages support more systems.
I think especially because of this last point, it's not worth attempting to
keep running the derivation linter at the moment, because it doesn't seem
sustainable. I can't see an way to run it that's futureproof and won't break
at some point in the future when packages in Guix support more systems.
As I'm seeing the inferior process crash with [1] just after fetching the
derivation lint warnings.
This change appears to help, although it's probably just a workaround. When
there's more packages/derivations, the caches might need clearing while
fetching the derivation lint warnings, or this will need to be split across
multiple processes.
1: Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
These cached store connections have caches associated with them, that take up
lots of memory, leading to the inferior crashing. This change seems to help.
To the end of the main revision processing transaction.
Currently, I think there are issues when this query does update some builds,
as those rows in the build table remain locked until the end of the
transaction. This then causes build event submission to hang. Moving this part
of the revision loading process to the end of the transaction should help to
mitigate this.
Previously, duplicates could creep through if the duplicate wasn't exported,
and only found as a replacement. Now they're filtered out.
This isn't ideal, as duplicates aren't always mistakes, it would be useful
still to capture this package, but having multiple entries for the same
name+version causes the comparison functionality to break.
As I think some operations (like the database backup) can block the DROP
SEQUENCE bit, so at least this approach means that the main transaction should
commit and then the sequence is eventually dropped.
This code is a bit tricky, since it should be compatible with old and new guix
revisions. I think these changes stop computing package derivations for
invalid systems, while hopefully not breaking anything.
Start at least looking for package replacements, and storing the
details (particularly the derivation). I'm looking at doing this so that build
servers using the Guix Data Service can build these derivations.
I guess this is a good change in general, but this seems to avoid a long
stack, which when a linter crashes, and the inferior tries to return the
exception details, and apparently hang the inferior/client as the reply isn't
written/read.