# Ghee: that thin layer of data change management over Btrfs, using Linux extended attributes (xattrs)

The tastiest way to manage data using Linux extended attributes (xattrs), written in pure Rust
and made of delicious, fibrous Open Source code.

Ghee provides tools for manipulating xattrs on individual files as well as for working with the filesystem as a document database
where the filesystem paths act as primary keys and extended attributes act as fields.

Data in Ghee tables can be managed using Git-style commits, implemented as Btrfs subvolumes and read-only snapshots.

In this way Ghee leverages the copy-on-write (CoW) nature of Btrfs to efficiently store deltas, even on large binary blobs.

## License

This software is licensed under GPL version 3 only.

## Conventions

Extended attribute names are parsed in a consistent manner by Ghee. Any xattr _not_ preceded by the `trusted`, `security`, `system`, or `user`
namespace will have the `user` namespace by default. For example, xattr `trusted.uptime` remains as is, while `uptime` would become
`user.uptime`.

Extended attribute values are parsed as `f64` numbers if possible; otherwise, they are interpreted as strings.

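As a rough illustration of the two rules above, the following Rust sketch normalizes an xattr name and parses a value. The names `normalize_name`, `parse_value`, and `Value` are hypothetical and are not taken from Ghee's source; the sketch only mirrors the behavior described in this section.

```rust
/// Namespaces that are kept as-is when they prefix an xattr name.
const NAMESPACES: [&str; 4] = ["trusted", "security", "system", "user"];

/// An xattr value: a number when the text parses as `f64`, a string otherwise.
#[derive(Debug, PartialEq)]
enum Value {
    Number(f64),
    Text(String),
}

/// Prefix `name` with `user.` unless it already starts with a known namespace
/// followed by a dot (assumed rule, mirroring the description above).
fn normalize_name(name: &str) -> String {
    let has_namespace = NAMESPACES.iter().any(|ns| {
        name.strip_prefix(ns)
            .map_or(false, |rest| rest.starts_with('.'))
    });
    if has_namespace {
        name.to_string()
    } else {
        format!("user.{}", name)
    }
}

/// Try `f64` first; fall back to keeping the raw text as a string.
fn parse_value(raw: &str) -> Value {
    match raw.parse::<f64>() {
        Ok(number) => Value::Number(number),
        Err(_) => Value::Text(raw.to_string()),
    }
}
```

Under these assumptions, `normalize_name("uptime")` yields `user.uptime` while `normalize_name("trusted.uptime")` is returned unchanged, and `parse_value("123")` yields a number while `parse_value("Jama")` yields a string.
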
## REPL

Running `ghee` with no arguments will enter a read-eval-print-loop (REPL), allowing for fluent command input:

```
$ ghee
Ghee 0.6.0

ghee$ set ./test -s test=1
```
etc.

## Subcommands

Ghee operates through a set of subcommands, each with a primary function. Run `ghee --help` to list them,
and `ghee $SUBCMD --help` to get usage information for each subcommand.

Examples of each subcommand follow:

### Move

Moves xattr values from one path to another.

* `ghee mv path1.txt path2.txt`: move all xattrs from path1.txt to path2.txt
* `ghee mv -f id path1.txt path2.txt`: move xattr `id` from path1.txt to path2.txt
* `ghee mv -f id -f url path1.txt path2.txt`: move xattrs `id` and `url` from path1.txt to path2.txt

### Copy

Copies xattr values from one path to another.

* `ghee cp path1.txt path2.txt`: copy all xattrs from path1.txt to path2.txt
* `ghee cp -f id path1.txt path2.txt`: copy xattr `id` from path1.txt to path2.txt
* `ghee cp -f id -f url path1.txt path2.txt`: copy xattrs `id` and `url` from path1.txt to path2.txt

### Remove

Removes xattr values, recursively by default.

* `ghee rm path.txt`: remove all xattrs on path.txt
* `ghee rm dir`: remove all xattrs from dir and all descendants
* `ghee rm --flat dir`: remove all xattrs from dir only, not its descendants
* `ghee rm -f id path.txt`: remove xattr `id` from path.txt
* `ghee rm -f id -f url path1.txt path2.txt path3.txt`: remove xattrs `id` and `url` from path1.txt, path2.txt, and path3.txt
* `ghee rm -f name dir`: remove xattr `name` from dir and all its descendants

### Set

Sets xattr values, recursively by default.

* `ghee set -s id=123 path1.txt`: set xattr `id` to value `123` on path1.txt
* `ghee set -s name=Jama dir`: set xattr `name` to value `"Jama"` on dir and all descendants
* `ghee set -s name=Amira --flat dir`: set xattr `name` to value `"Amira"` on dir only, not its descendants
* `ghee set -s id=123 -s url=http://example.com path1.txt path2.txt path3.txt`: set xattr `id` to value `123` and xattr `url` to value `"http://example.com"` on path1.txt, path2.txt, and path3.txt

### Get

Recursively get and print xattr values for one or more paths.

By default, the `get` subcommand outputs a tab-separated table with a column order of `path`, `field`, `value`.
The value bytes are written to stdout as-is without decoding.

Xattrs under the `user.ghee` prefix are excluded unless `-a` / `--all` is passed.

To opt out of the recursive default, use `--flat`.

* `ghee get dir`: print all xattrs for directory `dir` and all descendant files and directories, as raw (undecoded) TSV
* `ghee get -f id path1.txt`: print xattr `id` and its value on path1.txt as raw (undecoded) TSV
* `ghee get -f id -f url path1.txt path2.txt path3.txt`: print xattrs `id` and `url` and their respective values on path1.txt, path2.txt, and path3.txt as raw (undecoded) TSV

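The raw TSV format described above can be pictured with a small Rust sketch. The function `write_tsv_row` is hypothetical, as is the choice to print the fully namespaced field name; the point is only that the value bytes are passed through without any decoding.

```rust
use std::io::{self, Write};

/// Write one `path<TAB>field<TAB>value` row, copying the value bytes verbatim.
/// This is an illustrative helper, not Ghee's actual output routine.
fn write_tsv_row<W: Write>(
    out: &mut W,
    path: &str,
    field: &str,
    value: &[u8],
) -> io::Result<()> {
    out.write_all(path.as_bytes())?;
    out.write_all(b"\t")?;
    out.write_all(field.as_bytes())?;
    out.write_all(b"\t")?;
    // No UTF-8 validation here: invalid bytes reach the output unchanged.
    out.write_all(value)?;
    out.write_all(b"\n")
}
```

Because the writer is generic over `Write`, the same helper can target `std::io::stdout()` or an in-memory `Vec<u8>`.
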
The `get` command can also output JSON, in which case values are decoded as UTF-8, filling in a default codepoint when decoding fails:

* `ghee get -j --flat dir`: print all xattrs for directory `dir` itself but not its descendants, as UTF-8 decoded JSON
* `ghee get -j -f id path1.txt`: print xattr `id` and its value on path1.txt as UTF-8 decoded JSON
* `ghee get -j -f id -f url path1.txt path2.txt path3.txt`: print xattrs `id` and `url` and their respective values on path1.txt, path2.txt, and path3.txt as UTF-8 decoded JSON

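The decoding behavior described above matches what Rust's standard `String::from_utf8_lossy` does: it substitutes U+FFFD (the replacement character) for invalid byte sequences. Whether Ghee uses exactly this call, and whether U+FFFD is the default codepoint it fills in, are assumptions; `decode_for_json` is a hypothetical name.

```rust
/// Decode raw xattr bytes into a string suitable for JSON output.
/// Invalid UTF-8 sequences become U+FFFD rather than causing an error.
fn decode_for_json(raw: &[u8]) -> String {
    String::from_utf8_lossy(raw).into_owned()
}
```

Valid input such as `b"Jama"` comes back unchanged, while a stray `0xFF` byte in the middle of a value is replaced by a single U+FFFD.
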
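By adding `--where` (or `-w`), SQL WHERE-style clauses can be provided to select which files to include in the output. For example,
`ghee get -w age >= 65 ./patients` will select all files under directory `./patients` whose `user.age` attribute is 65 or greater.
When typing this at a shell prompt rather than in the REPL, note that an unquoted `>` is treated as output redirection by the shell, so it must be escaped or the clause quoted.
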
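To make the filtering semantics concrete, here is a minimal Rust sketch of evaluating one WHERE-style comparison against a record's xattrs. The types `Value`, `Op`, and `Predicate`, the function `record_matches`, and the exact operator set are all hypothetical; Ghee's real parser and evaluator may differ.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

/// An xattr value, following the number-or-string convention.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Number(f64),
    Text(String),
}

/// Comparison operators a WHERE-style clause might support (assumed set).
#[derive(Debug, Clone, Copy)]
enum Op {
    Eq,
    Ne,
    Lt,
    Le,
    Gt,
    Ge,
}

/// One parsed clause, e.g. `age >= 65` after the name becomes `user.age`.
struct Predicate {
    field: String,
    op: Op,
    value: Value,
}

/// Return true when the record has the field and the comparison holds.
/// In this sketch, missing fields and mismatched types never match.
fn record_matches(record: &HashMap<String, Value>, pred: &Predicate) -> bool {
    let actual = match record.get(&pred.field) {
        Some(value) => value,
        None => return false,
    };
    let ordering = match (actual, &pred.value) {
        (Value::Number(a), Value::Number(b)) => a.partial_cmp(b),
        (Value::Text(a), Value::Text(b)) => Some(a.cmp(b)),
        _ => None,
    };
    match ordering {
        None => false,
        Some(ord) => match pred.op {
            Op::Eq => ord == Ordering::Equal,
            Op::Ne => ord != Ordering::Equal,
            Op::Lt => ord == Ordering::Less,
            Op::Le => ord != Ordering::Greater,
            Op::Gt => ord == Ordering::Greater,
            Op::Ge => ord != Ordering::Less,
        },
    }
}
```

With a predicate of `user.age >= 65`, a record whose `user.age` is `70` matches and one whose `user.age` is `64` does not.
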
Nested indices are always ignored in `get` output, though they will be used as appropriate to shortcut traversal when WHERE-style
predicates are specified.

### Init

Initializes a directory as a table with a specified primary key, optionally inserting records from JSON, where each line is
parsed independently; see `people.json` in the repository for an example.

Examples:
* `ghee init -k name ./people`: marks the `./people` directory as a table with primary key of `name`
* `ghee init -k state -k id ./people-by-state-and-id`: marks the `./people-by-state-and-id` directory as a table with a compound primary
key of [`state`, `id`].
* `ghee init -k sauce ./pizza < ./pizzas.json`: marks the `./pizza` directory as a table with primary key `sauce`, importing records from `./pizzas.json`

### Create

Exactly like `init`, but creates the directory first, or errors if it already exists.

### Insert

Inserts JSON-formatted records into a table.

Records are read one per line from stdin.

* `ghee ins ./people < ./people.json`: inserts the records from `./people.json` into the table at `./people`, indexed by its primary key
* `ghee ins ./people ./people.json`: same as the above, but not depending on the shell for redirection

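Since filesystem paths act as primary keys, inserting a record amounts to choosing a path from the record's key fields. The Rust sketch below shows one plausible mapping, assuming each component of a compound key becomes one path component under the table directory. The function `record_path` is hypothetical, and records are simplified to string maps rather than parsed JSON.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Build the path for a record from the table directory and its key fields.
/// Returns `None` when the record lacks any key field.
/// Note: this sketch does no sanitizing of values containing `/`.
fn record_path(
    table: &str,
    key: &[&str],
    record: &HashMap<&str, &str>,
) -> Option<PathBuf> {
    let mut path = PathBuf::from(table);
    for field in key {
        // Each key field contributes one path component, in key order.
        path.push(record.get(field)?);
    }
    Some(path)
}
```

Under this assumption, a record with `state` of `CA` and `id` of `7` in a table keyed by [`state`, `id`] would land at `CA/7` beneath the table directory.
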
### Delete

Deletes records from a table.

They are unlinked from all table indices.

The records to be deleted are specified by providing either the components of the primary key or SQL-style WHERE clauses.

* `ghee del ./people Von`: because the table's primary key is `name`, deletes the record where `name=Von` from `./people` and all
indices.
* `ghee del ./people -w name=Von`: deletes `./people/Von` as above, unlinking from all indices.

### Index

Indexes a table.

When Ghee acts on a directory as if it were a database table, each file acts as a relational "record" with the primary key inferred from
its subpath under the table directory.

Each file's extended attributes act as the relational attributes.

Table directories created by Ghee also contain a special xattr `user.ghee.tableinfo` which stores the primary key and related indices
(including itself) of a table.

If no index location is provided, it will be placed in a default path under the table being indexed.

Examples:

* `ghee idx -k name ./people ./people-by-name`: recursively reindex the contents of `./people` into a new directory `./people-by-name` with primary key
coming from xattr `name` and files hardlinked to the corresponding files in `./people`.

That means the `./people-by-name` directory's files will have filenames taken from the names of the people as defined in xattr `name`.

The `user.ghee.tableinfo` xattr for `./people` records `./people-by-name` as a related index, and the reciprocal is true as well:
the `user.ghee.tableinfo` xattr for `./people-by-name` records `./people` as a related index.

Queries such as `get` and `del` will be opportunistically accelerated using available indices.

* `ghee idx -k region -k name -s ./people-by-name ./people-by-region-and-name`: recursively reindex the contents of `./people-by-name` into a new directory
`./people-by-region-and-name` with primary key being the compound of xattr `region` and xattr `name` (in that order) and files hardlinked to the
corresponding files in `./people`, resolved via the hardlinks in `./people-by-name`.

The `user.ghee.tableinfo` xattrs of both directories will be updated to reflect their relationship.

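The hardlinking step described in this section can be sketched in Rust using only the standard library. The function `link_into_index` is hypothetical, and the sketch assumes each key value becomes one path component under the index directory. Reading the key values from xattrs and updating `user.ghee.tableinfo` are left out, since Rust's standard library has no xattr API.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hard-link `record` into `index_dir` at a subpath built from `key_values`,
/// creating intermediate directories as needed. Returns the new link's path.
fn link_into_index(
    record: &Path,
    index_dir: &Path,
    key_values: &[&str],
) -> io::Result<PathBuf> {
    let mut target = index_dir.to_path_buf();
    for component in key_values {
        target.push(component);
    }
    if let Some(parent) = target.parent() {
        fs::create_dir_all(parent)?;
    }
    // Both names now refer to the same inode, so data and xattrs are shared.
    fs::hard_link(record, &target)?;
    Ok(target)
}
```

Because a hard link is a second name for the same inode, the index entry costs no extra data storage, and changes made through either path are visible through the other. Hard links cannot cross filesystems, so the index must live on the same filesystem as the table.
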
### List

Like the `ls` command, lists directory contents, but annotated from Ghee's point of view.

Each path is marked as either a table or a record. For tables, the primary key is given.

* `ghee ls`: lists the current directory's contents
* `ghee ls example`: lists the contents of ./example

### Commit

Stores the current state of the table in a Btrfs snapshot, identified by a UUID.

Optionally, a message describing the changes made since the last snapshot (if any) can be provided.

* `ghee commit -m "Update README.md"`

The UUID of the commit is printed.

### Log

Displays past commits.

* `ghee log`: lists all past commits in the current table

### Touch

Similar to the Unix `touch` command, creates an empty file at the specified path;
if the path is part of a Ghee table, xattrs are inferred from the path and written
to the new file.

This is a convenient way to add new records to new tables.

With `-p / --parents`, parent directories will be created.

* `ghee touch /pizza/pepperoni`: creates an empty file at `/pizza/pepperoni`, setting
xattr `topping` to `pepperoni` because the key of the `/pizza` table is `topping`.

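The inference in the example above can be sketched as pairing each component of the file's subpath with the corresponding key field. The Rust function `infer_key_xattrs` below is hypothetical and assumes one path component per key field; how Ghee actually looks up the table and its key is not shown.

```rust
use std::path::Path;

/// Pair each key field with the matching component of `file`'s subpath
/// under `table`. Returns `None` if `file` is not under `table` or the
/// number of components differs from the number of key fields.
fn infer_key_xattrs(
    table: &Path,
    key: &[&str],
    file: &Path,
) -> Option<Vec<(String, String)>> {
    let subpath = file.strip_prefix(table).ok()?;
    let parts: Vec<String> = subpath
        .components()
        .map(|c| c.as_os_str().to_string_lossy().into_owned())
        .collect();
    if parts.len() != key.len() {
        return None;
    }
    Some(key.iter().map(|k| k.to_string()).zip(parts).collect())
}
```

For a table at `/pizza` keyed by `topping`, the path `/pizza/pepperoni` yields the single pair (`topping`, `pepperoni`).
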
### Restore

Restores paths to their state in the `HEAD` commit.

* `ghee restore README.md`

### Reset

Resets all files in the table to their state in a specified commit.

* `ghee reset add133b4-f58b-a64e-992a-46f983a0e7ed`