Now that squee cooperates with suspendable ports, this is unnecessary. Use a
connection pool to still support running queries in parallel using multiple
connections.
In to two thread pools, a default one, and one reserved for essential
functionality.
There are some pages that use slow queries, so this should help stop those
pages block other operations.
This might be generally useful, but I've been looking at it as it offers a way
to try and improve query performance when you want to select all the
derivations related to the packages for a revision.
The data looks like this (for a specified system and target):
┌───────┬───────┐
│ level │ count │
├───────┼───────┤
│ 15 │ 2 │
│ 14 │ 3 │
│ 13 │ 3 │
│ 12 │ 3 │
│ 11 │ 14 │
│ 10 │ 25 │
│ 9 │ 44 │
│ 8 │ 91 │
│ 7 │ 1084 │
│ 6 │ 311 │
│ 5 │ 432 │
│ 4 │ 515 │
│ 3 │ 548 │
│ 2 │ 2201 │
│ 1 │ 21162 │
│ 0 │ 22310 │
└───────┴───────┘
Level 0 reflects the number of packages. Level 1 is similar as you have all
the derivations for the package origins. The remaining levels contain less
packages since it's mostly just derivations involved in bootstrapping.
When using a recursive CTE to collect all the derivations, PostgreSQL assumes
that the each derivation has the same number of inputs, and this leads to a
large overestimation of the number of derivations per a revision. This in turn
can lead to PostgreSQL picking a slower way of running the query.
When it's known how many new derivations you should see at each level, it's
possible to inform PostgreSQL this by using LIMIT's at various points in the
query. This reassures the query planner that it's not going to be handling
lots of rows and helps it make better decisions about how to execute the
query.
Generating system test derivations are difficult, since you generally need to
do potentially expensive builds for the system you're generating the system
tests for. You might not want to disable grafts for instance because you might
be trying to test whatever the test is testing in the context of grafts being
enabled.
I'm looking at skipping the system tests on data.guix.gnu.org, because they're
not used and quite expensive to compute.
I think the idle connections associated with idle threads are still taking up
memory, so especially now that you can configure an arbitrary number of
threads (and thus connections), I think it's good to close them regularly.
The server part of the guix-data-service doesn't work great as a guix service,
since it often fails to start if the migrations take any time at all.
To address this, start the server before running the migrations, and serve the
pages that work without the database, plus a general 503 response. Once the
migrations have completed, switch to the normal behaviour.
This was good in that it avoided having to deal with long running connections,
but it probably takes some time to open the connection, and these changes are
a step towards offloading the PostgreSQL queries to other threads, so they
don't block the threads for fibers.
From the normalized one, to the one actually contained within glibc. Recent
versions of glibc also contain symlinks linking the normalized codeset to the
locales with the .UTF-8 ending, but older ones do not.
Maybe handling codeset normalisation for queries would be good, but the locale
values ending in .UTF-8 are more compatible and allow the code to be
simplified. For querying, maybe there should be a locales table which handles
different representations.
Turns out, at the moment, this is ineffective when combined with the archive
formats, like the custom format in use. Therefore, move it to the pg_restore
command, where hopefully it'll work.
This is a comprimise, as this won't help restoring the backup in situations
you want tablespaces, but I'm currently viewing tablespaces as a deployment
concern, so maybe the right thing to do is exclude them. This approach will at
least keep the same behaviour in terms of restoring the backups locally.
This will fix the small dump creation process on data.guix.gnu.org, which is
currently broken because of the tablespace assignments when trying to restore
the backups.
These are related things, but somewhat separate. This change should make it
easier to deal with changes regarding querying build servers, and querying
substitute servers.
This is better than just deleting the entries that don't match up with the
remaining revisions, but also not very useful for local development (due to
the lack of data).
Hopefully this will help with the pg_restore in the create-small-backup
script:
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 2875; 0 0 COMMENT EXTENSION plpgsql
pg_restore: [archiver (db)] could not execute query: ERROR: must be owner of extension plpgsql
Command was: COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';