chore(tvix/castore): move data model docs to here

These describe the castore data model, so they should live in the castore crate.

Also, some minor edits to //tvix/store/docs/api.md, to honor the move of the
castore bits to tvix-castore.

Change-Id: I1836556b652ac0592336eac95a8d0647599f4aec
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9893
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
parent d545f11819
commit beae3a4bf1
3 changed files with 20 additions and 15 deletions
50
tvix/castore/docs/data-model.md
Normal file
@@ -0,0 +1,50 @@
# Data model

This document provides some more notes on the fields used in `castore.proto`.

See `//tvix/store/docs/api.md` for the full context.

## Directory message
`Directory` messages use the blake3 hash of their canonical protobuf
serialization as their identifier.

A `Directory` message contains three lists, `directories`, `files` and
`symlinks`, holding `DirectoryNode`, `FileNode` and `SymlinkNode` messages
respectively. They describe all the direct child elements that are contained in
a directory.

All three message types have a `name` field, specifying the (base)name of the
element (which MUST NOT contain slashes or null bytes, and MUST NOT be '.' or '..').
For reproducibility reasons, each list MUST be sorted by that name, and names
MUST be unique across all three lists.
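
The constraints on `name` can be checked mechanically. The following Python
sketch is illustrative only: the `Node` and `Directory` classes are hypothetical
stand-ins for the protobuf messages, and treating names as bytes compared
byte-wise is an assumption that the text above does not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for DirectoryNode / FileNode / SymlinkNode.
    # Only the shared `name` field matters for these checks.
    name: bytes


@dataclass
class Directory:
    # Hypothetical stand-in for the Directory message.
    directories: list = field(default_factory=list)
    files: list = field(default_factory=list)
    symlinks: list = field(default_factory=list)


def validate_name(name: bytes) -> None:
    """Reject names that contain slashes or null bytes, or are '.' / '..'."""
    if name in (b".", b".."):
        raise ValueError(f"name must not be '.' or '..': {name!r}")
    if b"/" in name or b"\x00" in name:
        raise ValueError(f"name contains a slash or null byte: {name!r}")


def validate_directory(directory: Directory) -> None:
    """Check that each list is sorted and names are unique across all lists."""
    seen = set()
    for nodes in (directory.directories, directory.files, directory.symlinks):
        names = [node.name for node in nodes]
        for name in names:
            validate_name(name)
            if name in seen:
                raise ValueError(f"duplicate name: {name!r}")
            seen.add(name)
        # Assumption: "sorted by name" means byte-wise ordering.
        if names != sorted(names):
            raise ValueError("list is not sorted by name")


def is_valid(directory: Directory) -> bool:
    """Convenience wrapper returning a boolean instead of raising."""
    try:
        validate_directory(directory)
    except ValueError:
        return False
    return True
```

A directory that holds `bin` as a child directory and `bin` as a file would be
rejected here, since uniqueness is checked across all three lists and not just
within each one.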

In addition to the `name` field, the various `*Node` messages have the following
fields:

## DirectoryNode
A `DirectoryNode` message represents a child directory.

It has a `digest` field, which points to the identifier of another `Directory`
message, making a `Directory` a merkle tree (or strictly speaking, a graph, as
two elements pointing to a child directory with the same contents would point
to the same `Directory` message).

There's also a `size` field, containing the (total) number of all child
elements in the referenced `Directory`, which helps for inode calculation.
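
One plausible reading of "(total) number of all child elements" is sketched
below in Python: the direct children of a directory, plus the `size` already
recorded on each child `DirectoryNode`. This interpretation is an assumption
made for illustration, and the class and function names are hypothetical. Under
that reading, a parent's size can be computed without fetching the whole
subtree.

```python
from dataclasses import dataclass, field


@dataclass
class DirectoryNode:
    # Hypothetical stand-in for the DirectoryNode message.
    name: bytes
    digest: bytes
    size: int  # total number of child elements in the referenced Directory


@dataclass
class Directory:
    # Hypothetical stand-in; files and symlinks are only counted here.
    directories: list = field(default_factory=list)
    files: list = field(default_factory=list)
    symlinks: list = field(default_factory=list)


def directory_size(directory: Directory) -> int:
    """Assumed definition: direct children plus the sizes of child directories."""
    direct = (
        len(directory.directories)
        + len(directory.files)
        + len(directory.symlinks)
    )
    return direct + sum(child.size for child in directory.directories)
```

With such a number available up front, a filesystem layer could reserve a range
of inode numbers for a subtree before reading it, which is one way the field
might help with inode calculation.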

## FileNode
A `FileNode` message represents a child (regular) file.

Its `digest` field contains the blake3 hash of the file contents. The contents
can be looked up in the `BlobService` by that digest.

The `size` field contains the size of the blob the `digest` field refers to.

The `executable` field specifies whether the file should be marked as
executable or not.

## SymlinkNode
A `SymlinkNode` message represents a child symlink.

Besides the `name` field, its only other field is `target`, a string
containing the target of the symlink.
57
tvix/castore/docs/why-not-git-trees.md
Normal file
@@ -0,0 +1,57 @@
## Why not git tree objects?

We've been experimenting with (some variations of) the git tree and object
format, and ultimately decided against using it as an internal format, and
instead adopted the one documented in the other documents here.

While the tvix-store API protocol shares some similarities with the format used
in git for trees and objects, the git one has shown some significant
disadvantages:

### The binary encoding itself

#### trees
The git tree object format is a binary, error-prone,
"made-to-be-read-and-written-from-C" format.

Tree objects are a combination of null-terminated strings, and fields of known
length. References to other tree objects use the literal sha1 hash of another
tree object in this encoding.
Extending or changing the format is very hard to do right, because parsers are
not aware they might be parsing something different.
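
To illustrate the mix of null-terminated strings and fixed-length fields, here
is a minimal Python sketch of a parser for the body of a git tree object. It
assumes the commonly documented entry layout (ASCII octal mode, a space, the
name, a null byte, then 20 raw sha1 bytes) and that the object header has
already been stripped. The function name is made up for this example.

```python
def parse_tree_entries(body: bytes) -> list:
    """Parse entries of the form b"<mode> <name>\\0<20 raw sha1 bytes>"."""
    entries = []
    pos = 0
    while pos < len(body):
        # The mode is terminated by a space, the name by a null byte.
        space = body.index(b" ", pos)
        nul = body.index(b"\x00", space + 1)
        mode = body[pos:space].decode("ascii")
        name = body[space + 1:nul]
        # The digest has no terminator: it is always exactly 20 bytes long.
        digest = body[nul + 1:nul + 21]
        if len(digest) != 20:
            raise ValueError("truncated sha1 digest")
        entries.append((mode, name, digest.hex()))
        pos = nul + 21
    return entries
```

Nothing in an entry says which hash function or field layout is in use, so a
parser like this one would silently misread any variation of the format.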

The tvix-store protocol uses a canonical protobuf serialization, and uses
the [blake3][blake3] hash of that serialization to point to other `Directory`
messages.
It's compact, and a wide range of encoder and decoder libraries is available
in many programming languages.
The choice of protobuf makes it easy to add new fields, and to make old clients
aware that unknown fields were encountered [^adding-fields].

#### blob
On disk, git blob objects start with a "blob" prefix, then the size of the
payload, and then the data itself. The hash of a blob is the literal sha1sum
over all of this - which makes it a very git-specific thing to request.
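
The effect of that prefix is easy to see with Python's standard `hashlib`. The
sketch below assumes the usual `blob <size>\0` header layout. The helper name
`git_blob_id` is made up for this example.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """sha1 over the git blob header followed by the payload."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + data).hexdigest()


def plain_sha1(data: bytes) -> str:
    """sha1 over the payload only, as a non-git data source would compute it."""
    return hashlib.sha1(data).hexdigest()


payload = b"hello\n"
# The two identifiers differ, so a store that only knows the plain content
# hash cannot be asked for a blob by its git object id.
print(git_blob_id(payload))
print(plain_sha1(payload))
```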

tvix-store simply uses the [blake3][blake3] hash of the literal contents
when referring to a file/blob, which makes it very easy to ask other data
sources for the same data, as no git-specific payload is included in the hash.
This also plays very well together with things like [iroh][iroh-discussion],
which plans to provide a way to substitute (large) blobs by their blake3 hash
over the IPFS network.

In addition to that, [blake3][blake3] makes it possible to do
[verified streaming][bao], as already described in other parts of the
documentation.

The git tree object format uses sha1 both for references to other trees and
hashes of blobs, which isn't really a hash function to fundamentally base
everything on in 2023.
The [migration to sha256][git-sha256] has also been dead for some years now,
and it's unclear what a "blake3" version of this would even look like.

[bao]: https://github.com/oconnor663/bao
[blake3]: https://github.com/BLAKE3-team/BLAKE3
[git-sha256]: https://git-scm.com/docs/hash-function-transition/
[iroh-discussion]: https://github.com/n0-computer/iroh/discussions/707#discussioncomment-5070197
[^adding-fields]: Obviously, adding new fields will change hashes, but it's something that's easy to detect.