Christopher Baines
9f102dbd39
Add code to delete nars entries
2023-08-01 14:13:10 +01:00
Christopher Baines
7495085f63
Delete unreferenced derivations in batches
...
To avoid a long blocking query.
2023-08-01 10:16:31 +01:00
Christopher Baines
bbc53deb1f
Rewrite deleting unreferenced derivations
...
Use fibers more, leaning in on the non-blocking use of Squee for parallelism.
2023-07-25 17:57:00 +01:00
Christopher Baines
7251c7d653
Stop using a pool of threads for database operations
...
Now that squee cooperates with suspendable ports, this is unnecessary. Use a
connection pool to still support running queries in parallel using multiple
connections.
2023-07-10 18:56:31 +01:00
Christopher Baines
742949cc97
Improve data deletion
2023-07-01 12:01:13 +01:00
Christopher Baines
47c482bdcc
Set lock_timeout for some data deletion transactions
...
As these can cause deadlocks. This will probably cause errors, so some
retrying will need to be added.
2023-05-09 08:55:09 +01:00
Christopher Baines
4fa7a3601e
Include distribution counts table in data deletion
2023-04-07 11:21:28 +01:00
Christopher Baines
2d96fbff48
Speed up deleting blocked_builds entries
2023-02-27 22:52:43 +00:00
Christopher Baines
1bce38a69d
Move the delete-unreferenced-derivations advisory lock
...
To better prevent two processes running at the same time.
2023-02-27 22:48:54 +00:00
Christopher Baines
1266d3d336
Remove redundant postgresql connection when deleting derivations
2023-02-14 20:59:21 +00:00
Christopher Baines
ebbcf36dc4
Delete blocked_builds entries when deleting derivations
2023-02-14 20:10:44 +00:00
Christopher Baines
5874c4ee37
Delete git_branches entries
...
When deleting data for a branch.
2023-02-14 19:57:30 +00:00
Christopher Baines
9872367c01
Avoid errors dropping partition tables if they don't exist
2023-02-13 20:10:23 +00:00
Christopher Baines
078516e0ab
Improve dropping package_derivations_by_guix_revision_range partitions
2023-02-13 19:26:44 +00:00
Christopher Baines
38b3657013
Use advisory locks to avoid deadlocks during data deletion
...
In the case where multiple data deleting processes end up running at the same
time.
2022-11-28 10:26:46 +00:00
Christopher Baines
39487cd7e6
Improve deleting derivations
...
Drop the batch size to get rid of warnings about memory usage and improve the
logging by adding duration information.
2022-07-08 20:55:58 +01:00
Christopher Baines
22c2ed2fa7
Fix ambiguous id column in delete-guix-revisions query
2022-06-16 12:46:32 +01:00
Christopher Baines
754f64718f
Fix DELETE query in delete-revisions-from-branch
2022-06-16 12:38:51 +01:00
Christopher Baines
be45e4251e
Fix ambiguous id column in delete-from-git-commits
2022-06-16 12:30:08 +01:00
Christopher Baines
71aaf1016b
Remove duplicate AND from delete-from-git-commits query
2022-06-16 12:25:47 +01:00
Christopher Baines
64be52844e
Partition the package_derivations_by_guix_revision_range table
...
And create a proper git_branches table in the process.
I'm hoping this will help with slow deletions from the
package_derivations_by_guix_revision_range table in the case where there are
lots of branches, since it'll separate the data for one branch from another.
These migrations will remove the existing data, so
rebuild-package-derivations-table will currently need to be run manually to
regenerate it.
2022-05-23 19:10:25 +01:00
Christopher Baines
971a474f65
Update delete-unreferenced-derivations
...
To delete from latest_build_status as well.
2020-10-13 20:33:07 +01:00
Christopher Baines
f02c245652
Add another guard clause to the data deletion code
...
I've seen this error [1], which may relate to the derivation-output-details-id
not being a number, so this check should confirm whether there is an issue.
1: Throw to key `psql-query-error' with args `(fatal-error "PGRES_FATAL_ERROR" "ERROR: invalid input syntax for integer: \"\"\n")'.
2020-10-10 13:34:54 +01:00
Christopher Baines
2c463fcdab
Guard against derivation IDs that aren't numbers
...
I saw an error suggesting that something came back that wasn't a number, and
this should give a more informative error.
2020-10-09 19:27:04 +01:00
Christopher Baines
062397e82b
Just use map rather than par-map& for deleting derivations
...
As I think par-map& is probably no faster.
2020-10-08 08:20:03 +01:00
Christopher Baines
936fda57c5
Make the derivation deletion batch size configurable
2020-10-08 07:52:03 +01:00
Christopher Baines
b540abaeba
Reduce the derivation deletion batch size
2020-10-08 07:49:28 +01:00
Christopher Baines
f68166514f
Actually delete more of the data for a revision
...
Previously the package_derivations table wasn't considered, which would mean
derivations would still be referenced. This commit fixes that, along with also
deleting unreferenced entries in some linter related tables.
2020-10-04 15:11:21 +01:00
Christopher Baines
48673b32cb
Fix delete-unreferenced-derivations
2020-10-04 13:23:15 +01:00
Christopher Baines
a24d3e934d
Extract out the ability to delete a range of commits
...
Some revisions have got disassociated from branches, probably because they
were associated with multiple branches in the first place. This should allow
deleting them.
2020-10-04 12:18:57 +01:00
Christopher Baines
e2e55c69de
Rework the short-lived PostgreSQL-specific connection channel
...
Into a generic thing more like (ice-9 futures). Including copying some bits
from the (ice-9 threads) module and adapting them to work with this fibers
approach, rather than futures. The advantage being that using fibers channels
doesn't block the threads being used by fibers, whereas futures would.
2020-10-03 21:32:46 +01:00
Christopher Baines
470573b318
Delete derivation_source_files that are unreferenced
...
This will also delete unreferenced derivation_source_file_nars.
2020-10-02 20:15:23 +01:00
Christopher Baines
54654417a3
Delete derivations in parallel
...
In an attempt to make this faster.
2020-10-01 19:15:32 +01:00
Christopher Baines
16600b1a43
Remove the deleting derivations progress output
...
As this is harder to do when deleting derivations in parallel.
2020-10-01 19:14:56 +01:00
Christopher Baines
fb4c7ecd4c
Delete derivations through a channel
...
Not much different from before, but this will allow parallelising things.
2020-10-01 19:14:11 +01:00
Christopher Baines
3330f034a4
Remove a now redundant part of the maybe-delete-derivation query
...
As this is covered by the big query selecting the derivation ids.
2020-09-30 20:34:33 +01:00
Christopher Baines
d844b325e2
Stop recursing now that derivation deletion selection is smarter
...
As this probably won't help with performance.
2020-09-30 20:07:41 +01:00
Christopher Baines
47af6c9661
Attempt to speed up derivation deletion
...
Stop querying for the file-name, as it's unused. Rather than fetching all ids,
then looking at each to see if it can be deleted, do some imperfect but not
too slow checks in the initial query.
2020-09-30 19:38:56 +01:00
Christopher Baines
02681d7e7a
Fix delete builds for derivation output details set
2020-09-27 16:21:51 +01:00
Christopher Baines
5b13ee2251
Delete builds for unreferenced derivations
2020-09-27 11:11:02 +01:00
Christopher Baines
52a23a5333
Further data deletion improvements
2020-09-27 11:10:47 +01:00
Christopher Baines
65e8bf3f8d
Add delete-revisions-from-branch-except-most-recent-n
2020-09-26 19:38:56 +01:00
Christopher Baines
992a0af63e
Split off delete-revisions-from-branch from delete-data-for-branch
...
To support not deleting all of the revisions.
2020-09-26 18:23:21 +01:00
Christopher Baines
f11421824d
Add a helper procedure to delete data for deleted branches
2020-05-23 21:05:44 +01:00
Christopher Baines
ca0d3ee754
Stop using package_versions_by_guix_revision_range
...
It's been replaced by the package_derivations_by_guix_revision_range table.
2020-03-24 20:44:57 +00:00
Christopher Baines
9178bd51a9
Add a function to delete unreferenced derivations
2020-02-16 22:29:25 +00:00
Christopher Baines
b087cfca67
Define the code to delete data from non-master branches properly
2020-02-16 10:59:38 +00:00
Christopher Baines
773e5a9c38
Add a module to handle deleting data
...
This, along with the way of specifying which branches are processed, is a way
to manage the data stored within the Guix Data Service.
This only goes so far: it doesn't delete derivations, but it does delete some
of the information related to a revision.
2020-02-15 11:36:31 +00:00