Commit graph

19 commits

Author SHA1 Message Date
Florian Klink
28f5c13c53 fix(tvix/castore): don't emit ret as INFO
This otherwise gets a bit spammy.

Change-Id: I288350a600d79a394c239f253424ad55bc3cefc5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10954
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
2024-02-18 07:12:27 +00:00
Ryan Lahfa
68bba48d59 feat(tvix/castore): process_entry cannot process unsupported nodes
In the past, we had a `todo!` on unsupported node types, this returns a proper error
that can be caught by the caller.

Change-Id: Icba4c1dab33c0d670a97f162c9b358d1ed5855cb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10675
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-01-22 14:15:42 +00:00
Ryan Lahfa
7275288f0e refactor(tvix/castore): break down ingest_path
In one function that does the heavy lifting: `ingest_entries`, and three additional helpers:

- `walk_path_for_ingestion` which perform the tree walking in a very naive way and can be replaced by the user
- `leveled_entries_to_stream` which transforms a list of a list of
  entries ordered by their depth in the tree to a stream of entries in
  the bottom to top order (Merkle-compatible order I will say in the
  future).
- `ingest_path` which calls the previous functions.

Change-Id: I724b972d3c5bffc033f03363255eae448f017cef
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10573
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: raitobezarius <tvl@lahfa.xyz>
2024-01-20 18:26:17 +00:00
Ryan Lahfa
1f1a42b4da feat(tvix/castore): ingestion does DFS and invert it
To make use of the filtering feature, we need to revert the internal walker to a real DFS.

We will therefore just invert the whole tree by storing all of its
contents in a level-keyed vector.

This is horribly expensive in memory, this is a compromise between CPU
and memory, here is the fundamental reason for why:

When you encounter a directory, it's either a leaf or not, i.e. it
contains subdirectories or not.

To know this fact, you can:

- wait until you notice subdirectories under it, i.e. you need to store
  any intermediate nodes you see in the meantime -> memory penalty.
- getdents or readdir on it to determine *NOW* its subdirectories -> CPU
  penalty and I/O penalty.

This is an implementation of the first proposal, we pay memory.

In practice, we are paying O(#nb of nodes) in memory.

There's a smarter albeit much more complicated algorithm that pays only
O(\sum_i #siblings(p_i)) nodes where (p_1, ..., p_n) is the path to a leaf.

which means for:

             A
            / \
           B   C
          /   / \
         D   E   F

We would never store D, E, F but only E, F at a given time.
But we would still store B, C no matter what.

Change-Id: I456ed1c3f0db493e018ba1182665d84bebe29c11
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10567
Tested-by: BuildkiteCI
Autosubmit: raitobezarius <tvl@lahfa.xyz>
Reviewed-by: flokli <flokli@flokli.de>
2024-01-20 17:16:01 +00:00
Ryan Lahfa
93afc711f6 feat(tvix/castore): convert import error to std::io::Error
So that we can just `map_err` easily in functions returning `std::io::Error` but calling functions
returning `castore::import::Error`.

Change-Id: Id181b95e8431c69e95f3a8cd569ca10306656e1d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10572
Autosubmit: raitobezarius <tvl@lahfa.xyz>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-01-18 14:40:06 +00:00
Florian Klink
89882ff9b1 refactor(tvix): use AsRef<dyn …> instead of Deref<Target= …>
Removes some more needs for Arcs.

Change-Id: I9a9f4b81641c271de260e9ffa98313a32944d760
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10578
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-09 14:08:22 +00:00
Ryan Lahfa
cbcd078684 chore(tvix/castore): fix the docstring for process_entry
It was a `//` not a `///`.

Change-Id: Iee3e8c116d73b5dd8a41c027153714415a66695f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10566
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-01-06 01:39:41 +00:00
Florian Klink
f20969de9b refactor(tvix/castore): relax trait bounds for DS
Make this an `AsRef<dyn DirectoryService>`.

This helps dropping some Clone requirements.

Unfortunately, we can't thread this through to TvixStoreIO just yet.

Change-Id: I3f07eb28d6c793d3313fe21506ada84d5a8aa3ac
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10533
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-01-05 16:43:34 +00:00
Florian Klink
ddae4860c2 feat(tvix/castore/import): generalize ingest_path
We don't actually care if it's an Arc<dyn BlobService>, or something
else, as long as we can Deref to a BlobService and clone.

Change-Id: I0852aaf723f51c5e6b820be8db1199d17309ab08
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10510
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-01-01 00:54:14 +00:00
Florian Klink
81ef26ba3f fix(tvix/castore/import): don't unwrap entry
If the path specified doesn't exist, construct a proper error instead
of panicking.

Part of b/344.

Change-Id: Id5c6a91248b0a387f3e8f138f8e686e402009e8f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10330
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2023-12-12 18:07:11 +00:00
Florian Klink
afd09c3290 feat(tvix/castore/import): log returned errors
This will emit a log event / trace in case this function returns an
error-y type.

Change-Id: I48db6807f3e42304357c422a2b6e177cb8b95228
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10329
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2023-12-12 18:07:11 +00:00
Florian Klink
30d82efa77 refactor(tvix/castore/blobservice): use io::Result in trait
For all these calls, the caller has enough context about what it did, so
it should be fine to use io::Result here.

We pretty much only constructed crate::Error::StorageError before
anyways, so this conveys *more* information.

Change-Id: I5cabb3769c9c2314bab926d34dda748fda9d3ccc
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10328
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2023-12-12 18:06:40 +00:00
sterni
875bb26fc3 fix(tvix/castore): correctly flag unreachable code
Change-Id: Id09afa4b77c3c70fb5695f253f6df4aa88b61e19
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10113
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2023-11-24 23:36:15 +00:00
Florian Klink
2546446d51 feat(tvix/castore): bump [Directory,File]Node size to u64
Having more than 4GiB files is quite possible (think about the NixOS
graphical installer, and an uncompressed iso of it).

No wire format changes.

Change-Id: Ia78a07e4c554e91b93c5b9f8533266e4bd7f22b6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9950
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
2023-11-05 10:57:01 +00:00
Connor Brewster
0325ae3ba3 fix(tvix/castore): Fix race when ingesting into castore
After finishing the ingestion, the directory putter was not being
closed. This caused a race where the root directory node was accessed
before the directory node had been flushed to the server.

This patch makes it so we close the putter before returning the root
node which should ensure that the root node exists on the directory
service server before the `ingest_path` function returns.

Fixes b/326

Change-Id: Id16cf46bc48962121dde76d3c9c23a845d87d0f1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9761
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2023-10-17 13:01:29 +00:00
Linus Heckemann
b1ab8075cd docs(tvix/castore): point out use of contents_first
Change-Id: I7620d2abe01675ea7028a478d4f8447e36d5768b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9605
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2023-10-13 17:14:48 +00:00
Florian Klink
1629f3064f docs(tvix/castore): remove TODO
This probably was about passing around directory_putter at some point,
which we do, so whatever this meant, it's not actionable anymore.

Change-Id: I1b4e0cdd2119bf2b2a9cf06d186a3b476b0ff367
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9573
Reviewed-by: Linus Heckemann <git@sphalerite.org>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2023-10-08 16:00:44 +00:00
edef
9a7c078a69 fix(tvix/castore): explicitly name lifetimes in process_entry
Otherwise this produces absolutely inscrutable errors:

    note: hidden type `[async fn body@castore/src/import.rs:63:1: 63:94]` captures lifetime '_#24r

Change-Id: If5d9626c9edf400de5bcec038bcaa5a3117561f0
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9506
Tested-by: BuildkiteCI
Autosubmit: edef <edef@edef.eu>
Reviewed-by: flokli <flokli@flokli.de>
2023-10-04 08:31:40 +00:00
Florian Klink
32f41458c0 refactor(tvix): move castore into tvix-castore crate
This splits the pure content-addressed layers from tvix-store into a
`castore` crate, and only leaves PathInfo related things, as well as the
CLI entrypoint in the tvix-store crate.

Notable changes:
 - `fixtures` and `utils` had to be moved out of the `test` cfg, so they
   can be imported from tvix-store.
 - Some ad-hoc fixtures in the test were moved to proper fixtures in the
   same step.
 - The protos are now created by a (more static) recipe in the protos/
   directory.

The (now two) golang targets are commented out, as it's not possible to
update them properly in the same CL. This will be done by a followup CL
once this is merged (and whitby deployed)

Bug: https://b.tvl.fyi/issues/301

Change-Id: I8d675d4bf1fb697eb7d479747c1b1e3635718107
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9370
Reviewed-by: tazjin <tazjin@tvl.su>
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2023-09-22 12:51:21 +00:00
Renamed from tvix/store/src/import.rs (Browse further)