Commit graph

72 commits

Author SHA1 Message Date
Florian Klink
2beabe968c refactor(nix-compat/store_path): make StorePath generic on S
Similar to how cl/12253 already did this for `Signature`, we apply the
same logic to `StorePath`.

`StorePathRef<'a>'` is now a `StorePath<&'a str>`, and there's less
redundant code for the two different implementation.

`.as_ref()` returns a `StorePathRef<'_>`, `.to_owned()` gives a
`StorePath<String>` (for now).

I briefly thought about only publicly exporting `StorePath<String>`
as `StorePath`, but the diff is not too large and this will make it
easier to gradually introduce more flexibility in which store paths to
accept.

Also, remove some silliness in `StorePath::from_absolute_path_full`,
which now doesn't allocate anymore.

Change-Id: Ife8843857a1a0a3a99177ca997649fd45b8198e6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12258
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-08-20 15:14:27 +00:00
Florian Klink
c75b0d08ce chore(tvix/glue): drop some explicit allow(clippy::mutable_key_type)
This is covered by clippy.toml these days.

Change-Id: I2330af5781844d5f9d975793d770efcea48d371b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12223
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-08-17 15:59:10 +00:00
Florian Klink
21ceef4934 refactor(tvix/castore): add into_nodes(), implement consuming proto conv
Provide a into_nodes() function on a Directory, which consumes self and
returns owned PathComponent and Node.

Use it to provide a proper conversion from Directory to the proto
variant that doesn't clone.

There's no need for the one taking only &Directory, we don't use it
anywhere, and once someone needs that they might as well clone Directory
before converting it.

Update all other users of the `.nodes()` function to use `.into_nodes()`
where applicable, and avoid some more cloning there.

Change-Id: Id4577b9eb173c012e225337458898d3937112bcb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12218
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-08-17 15:59:10 +00:00
Florian Klink
5ec93b57e6 refactor(tvix/castore): add PathComponent type for checked components
This encodes a verified component on the type level. Internally, it
contains a bytes::Bytes.

The castore Path/PathBuf component() and file_name() methods now
return this type, the old ones returning bytes were renamed to
component_bytes() and component_file_name() respectively.

We can drop the directory_reject_invalid_name test - it's not possible
anymore to pass an invalid name to Directories::add.
Invalid names in the Directory proto are still being tested to be
rejected in the validate_invalid_names tests.

Change-Id: Ide4d16415dfd50b7e2d7e0c36d42a3bbeeb9b6c5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12217
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-08-17 15:59:10 +00:00
Florian Klink
8ea7d2b60e refactor(tvix/castore): drop {Directory,File,Symlink}Node
Add a `SymlinkTarget` type to represent validated symlink targets.
With this, no invalid states are representable, so we can make `Node` be
just an enum of all three kind of types, and allow access to these
fields directly.

Change-Id: I20bdd480c8d5e64a827649f303c97023b7e390f2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12216
Reviewed-by: benjaminedwardwebb <benjaminedwardwebb@gmail.com>
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-08-17 09:46:01 +00:00
Florian Klink
49b173786c refactor(tvix/castore): remove name from Nodes
Nodes only have names if they're contained inside a Directory, or if
they're a root node and have something else possibly giving them a name
externally.

This removes all `name` fields in the three different Nodes, and instead
maintains it inside a BTreeMap inside the Directory.

It also removes the NamedNode trait (they don't have a get_name()), as
well as Node::rename(self, name), and all [Partial]Ord implementations
for Node (as they don't have names to use for sorting).

The `nodes()`, `directories()`, `files()` iterators inside a `Directory`
now return a tuple of Name and Node, as does the RootNodesProvider.

The different {Directory,File,Symlink}Node struct constructors got
simpler, and the {Directory,File}Node ones became infallible - as
there's no more possibility to represent invalid state.

The proto structs stayed the same - there's now from_name_and_node and
into_name_and_node to convert back and forth between the two `Node`
structs.

Some further cleanups:

The error types for Node validation were renamed. Everything related to
names is now in the DirectoryError (not yet happy about the naming)

There's some leftover cleanups to do:
 - There should be a from_(sorted_)iter and into_iter in Directory, so
   we can construct and deconstruct in one go.
   That should also enable us to implement conversions from and to the
   proto representation that moves, rather than clones.

 - The BuildRequest and PathInfo structs are still proto-based, so we
   still do a bunch of conversions back and forth there (and have some
   ugly expect there). There's not much point for error handling here,
   this will be moved to stricter types in a followup CL.

Change-Id: I7369a8e3a426f44419c349077cb4fcab2044ebb6
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12205
Tested-by: BuildkiteCI
Reviewed-by: yuka <yuka@yuka.dev>
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: benjaminedwardwebb <benjaminedwardwebb@gmail.com>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-08-17 09:45:58 +00:00
Florian Klink
c7845f3c88 refactor(tvix/castore): move *Node and Directory to crate root
*Node and Directory are types of the tvix-castore model, not the tvix
DirectoryService model. A DirectoryService only happens to send
Directories.

Move types into individual files in a nodes/ subdirectory, as it's
gotten too cluttered in a single file, and (re-)export all types from
the crate root.

This has the effect that we now cannot poke at private fields directly
from other files inside `crate::directoryservice` (as it's not all in
the same file anymore), but that's a good thing, it now forces us to go
through the proper accessors.

For the same reasons, we currently also need to introduce the `rename`
functions on each *Node directly.

A followup is gonna move the names out of the individual enum kinds, so
we can better represent "unnamed nodes".

Change-Id: Icdb34dcfe454c41c94f2396e8e99973d27db8418
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12199
Reviewed-by: yuka <yuka@yuka.dev>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-08-13 18:39:49 +00:00
Yureka
3ca0b53840 refactor(tvix/castore): use Directory struct separate from proto one
This uses our own data type to deal with Directories in the castore model.

It makes some undesired states unrepresentable, removing the need for conversions and checking in various places:

 - In the protobuf, blake3 digests could have a wrong length, as proto doesn't know fixed-size fields. We now use `B3Digest`, which makes cloning cheaper, and removes the need to do size-checking everywhere.
 - In the protobuf, we had three different lists for `files`, `symlinks` and `directories`. This was mostly a protobuf size optimization, but made interacting with them a bit awkward. This has now been replaced with a list of enums, and convenience iterators to get various nodes, and add new ones.

Change-Id: I7b92691bb06d77ff3f58a5ccea94a22c16f84f04
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12057
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-08-13 12:17:01 +00:00
Ilan Joselevich
d2a80dda88 feat(tvix/cli): Add derivation file dumping functionality
Provides a derivation file dumping functionality for tvix-cli that can
be used when passing the --drv-dumpdir CLI arg to tvix-cli.

This will dump all the known derivation files into the specified
directory, making it easier to debug derivation divergences between Tvix
generated drvs and the drvs generated by Nix.

Supersedes: https://cl.tvl.fyi/c/depot/+/11265

Change-Id: I0e10b26eba22032b84ac543af0d4150ad87aed3e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12192
Autosubmit: Ilan Joselevich <personal@ilanjoselevich.com>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-08-12 14:50:17 +00:00
Yureka
67335c41b7 refactor(tvix): move service addrs into shared clap struct
Change-Id: I7cab29ecfa1823c2103b4c47b7d784bc31459d55
Reviewed-on: https://cl.tvl.fyi/c/depot/+/12008
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: yuka <yuka@yuka.dev>
2024-07-22 13:36:08 +00:00
Yureka
8b77c7fcd7 refactor(tvix/store): use composition in tvix_store crate
Change-Id: Ie6290b296baba2b987f1a61c9bb4c78549ac11f1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11983
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: yuka <yuka@yuka.dev>
Tested-by: BuildkiteCI
2024-07-20 19:37:27 +00:00
Aspen Smith
dfe137786c refactor(tvix/eval): Builderize Evaluation
Make constructing of a new Evaluation use the builder pattern rather
than setting public mutable fields. This is currently a pure
refactor (no functionality has changed) but has a few advantages:

- We've encapsulated the internals of the fields in Evaluation, meaning
  we can change them without too much breakage of clients
- We have type safety that prevents us from ever changing the fields of
  an Evaluation after it's built (which matters more in a world where we
  reuse Evaluations).

More importantly, this paves the road for doing different things with
the construction of an Evaluation - notably, sharing certain things like
the GlobalsMap across subsequent evaluations in eg the REPL.

Fixes: b/262
Change-Id: I4a27116faac14cdd144fc7c992d14ae095a1aca4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11956
Tested-by: BuildkiteCI
Autosubmit: aspen <root@gws.fyi>
Reviewed-by: flokli <flokli@flokli.de>
2024-07-06 15:03:46 +00:00
Florian Klink
540e566900 refactor(tvix/glue): take &CAHash, not CAHash
We use a bit less cloning that way.

Change-Id: I28bf99577e4a481e35fbf99d0724adab5502a1bd
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11874
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
2024-06-26 04:51:31 +00:00
Florian Klink
78eb22c54d feat(tvix/glue): handle regular file at builtins.path import
If builtins.path is passed a regular file, no filtering is applied.
We use the just-introduced file_type function in the EvalIO trait for
that.

This means, we don't need to pass through filtered_ingest, and can
assemble the FileNode directly in that specific match case.

This also means, we can explicitly calculate the sha256 flat digest,
and avoid having to pipe through the file contents again (via
blob_to_sha256_hash) to construct the sha256 digest.

Change-Id: I500b19dd9e4b7cc897d88b44547e7851559e5a4e
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11872
Tested-by: BuildkiteCI
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-06-26 04:51:31 +00:00
Florian Klink
080654aaf9 feat(tvix/eval): add file_type method
This allows peeking of the type at a given path.

It's necessary, as an open() might not fail until you try to read()
from it, and generally, stat'ing can be faster in some cases.

Change-Id: Ib002da3194a3546ca286de49aac8d1022ec5560f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11871
Tested-by: BuildkiteCI
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-06-26 04:51:31 +00:00
Florian Klink
b992ca49a6 fix(tvix/glue/tvix_store_io): also populate input sources
These also need to be present in the input nodes of the BuildRequest.

Change-Id: Ie9b957805e42f766002581adc6182a6543c5333b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11802
Reviewed-by: Brian Olsen <me@griff.name>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-06-13 19:59:28 +00:00
Florian Klink
d947f61d36 fix(tvix/glue/tvix_store_io): disable concurrent fetches for now
We need some shared queue, preventing the same fetches/builds from
getting triggered multiple times unnecessarily.

Change-Id: I7c4a3c66db558f5cccd66865b170242b758e3e02
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11800
Reviewed-by: aspen <root@gws.fyi>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-06-13 16:19:18 +00:00
Florian Klink
7f29cab1cc fix(tvix/glue/tvix_store_io): distinguish waiting and building
We immediately reported "Building", even though then populated necessary
inputs, which looked a bit odd. Make it clear we're still waiting, and
update the spinner message once we have all inputs we were waiting for.

In the future, we might want to have separate spans for this, so the
timer gets reset, but that's something for later.

Change-Id: Ic22c9a906d0e7e7179c5ee328162401261efc224
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11799
Reviewed-by: aspen <root@gws.fyi>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-06-13 16:18:14 +00:00
Florian Klink
99c5a2e8bc feat(tvix/glue): report progress on all fetches, use progress bars
This should also report progress on fetches which we couldn't delay
until actually having to IO into them, like `builtins.fetchurl` calls
without a upfront-provided hash.

While at it, upgrade the progress spinners to progress bars, which
increment if we know the size of the fetch.

Change-Id: Ic3f332286d8bc2177f3d994ba25b165728d4b702
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11797
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: aspen <root@gws.fyi>
2024-06-13 16:16:42 +00:00
Florian Klink
7ee55c293c fix(tvix/glue/tvix_store_io): use same case for progress messages
"Fetching" was uppercase, "building" was lowercase.
Let's make this consistent.

Change-Id: I11c16f1a7d2057ada4d057e553a4ceaa59597f26
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11796
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Ilan Joselevich <personal@ilanjoselevich.com>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-06-13 11:59:58 +00:00
Florian Klink
ddd88a589b feat(tvix/glue/tvix_store_io): show progress info
In `store_path_to_node`, in case we need to build or fetch something,
render a progress bar, using the spinner for now.
We can upgrade this to a progress *bar* later.

Change-Id: I4a7cf5ef8f639076f176af9b39d276be3f37c8ff
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11793
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-06-12 23:43:54 +00:00
Florian Klink
842d6816bf feat(tvix/glue): support builtin:fetchurl
nixpkgs calls <nix/fetchurl.nix> during nixpkgs bootstrap.

This produces a fake derivation with system = builtin
and builder = builtin:fetchurl, and needs to download files from the
internet.

At the end of the Derivation construction, if we have such a derivation,
also synthesize a `Fetch` struct, which we add to the known fetch paths.

This will then cause these fetches to be picked up like all other
fetches in TvixStoreIO.

Change-Id: I72cbca4f85da106b25eda97693a6a6e59911cd57
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10975
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-06-12 22:31:17 +00:00
Aspen Smith
72b9a126b8 feat(tvix/glue): Implement builtins.storePath
This one's relatively simple - we just check if the store path exists,
and if it does we make a new contextful string containing the store path
as its only context element.

Automatic testing seems tricky for this (I think?) so I tested it
manually:

tvix-repl> builtins.storePath /nix/store/yn46i4xx5alh7gs6fpkxk430i34rp2q9-hello-2.12.1
=> "/nix/store/yn46i4xx5alh7gs6fpkxk430i34rp2q9-hello-2.12.1" :: string

Change-Id: I8a0d9726e4102ab872c53c2419679c2c855a5a18
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11696
Tested-by: BuildkiteCI
Autosubmit: aspen <root@gws.fyi>
Reviewed-by: flokli <flokli@flokli.de>
2024-06-05 17:50:15 +00:00
Florian Klink
14766cfe1d refactor(tvix/store): drop calculate_nar from PathInfoService
This shouldn't be part of the PathInfoService trait.

Pretty much none of the PathInfoServices do implement it, and requiring
them to implement it means they also cannot make use of this calculation
already being done by other PathInfoServices.

Move it out into its own NarCalculationService trait, defined somewhere
at tvix_store::nar, and have everyone who wants to trigger nar
calculation use nar_calculation_service directly, which now is an
additional field in TvixStoreIO for example.

It being moved outside the PathInfoService trait doesn't prohibit
specific implementations to implement it (like the GRPC client for the
`PathInfoService` does.

This is currently wired together in a bit of a hacky fashion - as of
now, everything uses the naive implementation that traverses blob and
directoryservice, rather than composing it properly. I want to leave
that up to a later CL, dealing with other parts of store composition
too.

Change-Id: I18d07ea4301d4a07651b8218bc5fe95e4e307208
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11619
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-05-11 13:33:59 +00:00
Connor Brewster
da9bc274f3 refactor(tvix): remove usage of async-recursion
Rust 1.77 supports async recursion as long as there is some form of
indirection (ie. `Box::pin`). This removes the need to use the
async-recursion crate.

Change-Id: Ic9613ab7f32016f0103032a861edff92e2fb8b41
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11596
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-05-06 16:05:09 +00:00
Florian Klink
abc0553eb8 feat(tvix/castore/directory/traverse): use castore Paths
This switches from using std::path::Path to using castore paths.

We can drop some error handling in descend_to, as absolute (or redundant)
paths are not representable.

We however now need to convert from a std::path::Path to our
representation, and decide to accept .. canonicalization, as paths in
EvalIO might contain this. Dealing .. to hop into another store path, if
we encounter this, should be dealt with in a previous step.

Change-Id: I5e94693808420c5d56587c68731252b54755bf93
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11575
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-05-02 15:26:29 +00:00
Florian Klink
e18bc33529 fix(tvix/glue/tvix_store_io): remove early return
Doing the fetch comes up with the root node, but we still need to
descend from there to the desired subpath.

Move things around to ensure the fetch case also only sets root_node.

This logic should probably be moved into smaller, easier to consume
functions.

Change-Id: I6ab9317df794f53d2504029bbc77859e89fef1ed
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11507
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2024-04-23 14:53:31 +00:00
Florian Klink
30950833c9 feat(tvix/glue/store_io): have KnownPaths track fetches too
Have fetcher builtins call queue_fetch() whenever they don't need to
fetch something immediately, and teach TvixStoreIO::store_path_to_node
on how to look up (and call ingest_and persist on our Fetcher).

Change-Id: Id4bd9d639fac9e4bee20c0b1c584148740b15c2f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11501
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
2024-04-23 12:40:55 +00:00
Florian Klink
091de12a9a refactor(tvix/glue): move Fetch[er] into its own types, fetch lazily
We actually want to delay fetching until we actually need the file. A
simple evaluation asking for `.outPath` or `.drvPath` should work even
in a pure offline environment.

Before this CL, the fetching logic was quite distributed between
tvix_store_io, and builtins/fetchers.rs.

Rather than having various functions and conversions between structs,
describe a Fetch as an enum type, with the fields describing the fetch.

Define a store_path() function on top of `Fetch` which can be used to
ask for the calculated store path (if the digest has been provided
upfront).

Have a `Fetcher` struct, and give it a `fetch_and_persist` function,
taking a `Fetch` as well as a desired name, and have it deal with all
the logic of persisting the PathInfos. It also returns a StorePathRef,
similar to the `.store_path()` method on a `Fetch` struct.

In a followup CL, we can extend KnownPaths to track fetches AND
derivations, and then use `Fetcher` when we need to do IO into that
store path.

Change-Id: Ib39a96baeb661750a8706b461f8ba4abb342e777
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11500
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Autosubmit: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-23 12:30:48 +00:00
Florian Klink
5fc403587f refactor(tvix/castore): ingest filesystem entries in parallel
Rather than carrying around an Future in the IngestionEntry::Regular,
simply carry the plain B3Digest.

Code reading through a non-seekable data stream has no choice but to
read and upload blobs immediately, and code seeking through something
seekable (like a filesystem) probably knows better what concurrency to
pick when ingesting, rather than the consuming side.

(Our only) one of these seekable source implementations is now doing
exactly that. We produce a stream of futures, and then use
[StreamExt::buffered] to process more than one, concurrently.

We still keep the same order, to avoid shuffling things and violating
the stream order.

This also cleans up walk_path_for_ingestion in castore/import, as well
as ingest_dir_entries in glue/tvix_store_io.

Change-Id: I5eb70f3e1e372c74bcbfcf6b6e2653eba36e151d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11491
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-04-20 18:54:28 +00:00
Connor Brewster
01239a4f6f fix(tvix): fix outdated comment and error in TvixStoreIO::open
This function was originally called `read_to_string` but was changed to
`open` to make it so that file contents aren't always held in memory.
A comment and error message were not updated to reflect the new name of
this method.

Change-Id: I3d86e2f6d7006c2e1513121fc3c62efcb7e7b9bb
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11495
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-20 18:49:16 +00:00
Aspen Smith
3107961428 feat(tvix/eval): Implement builtins.fetchTarball
Implement a first pass at the fetchTarball builtin.

This uses much of the same machinery as fetchUrl, but has the extra
complexity that tarballs have to be extracted and imported as store
paths (into the directory- and blob-services) before hashing. That's
reasonably involved due to the structure of those two services.

This is (unfortunately) not easy to test in an automated way, but I've
tested it manually for now and it seems to work:

    tvix-repl> (import ../. {}).third_party.nixpkgs.hello.outPath
    => "/nix/store/dbghhbq1x39yxgkv3vkgfwbxrmw9nfzi-hello-2.12.1" :: string

Co-authored-by: Connor Brewster <cbrewster@hey.com>
Change-Id: I57afc6b91bad617a608a35bb357861e782a864c8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11020
Autosubmit: aspen <root@gws.fyi>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-20 14:58:04 +00:00
Florian Klink
e9db0449e7 refactor(tvix/castore/import): make module, split off fs and error
Move error types and filesystem-specific functions to a separate file,
and keep the fs:: namespace in public exports.

Change-Id: I5e9e83ad78d9aea38553fafc293d3e4f8c31a8c1
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11486
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Autosubmit: flokli <flokli@flokli.de>
2024-04-20 14:14:19 +00:00
Connor Brewster
259d7a3cfa refactor(tvix/castore): generalize store ingestion streams
Previously the store ingestion code was coupled to `walkdir::DirEntry`s
produced by the `walkdir` crate which made it impossible to reuse
ingesting from other sources like tarballs or NARs.

This introduces a `IngestionEntry` which carries enough information for
store ingestion and a future for computing the Blake3 digest of files.
This allows the producer to perform file uploads in a way that makes
sense for the source, ie. the filesystem upload could concurrently
upload multiple files at the same time, while the NAR ingestor will need
to ingest the entire blob before yielding the next blob in the stream.
In the future we can buffer small blobs and upload them concurrently,
but the full blob still needs to be read from the NAR before advancing.

Change-Id: I6d144063e2ba5b05e765bac1f27d41b3c8e7b283
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11462
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-19 20:37:05 +00:00
Connor Brewster
63116d8c21 fix(tvix): Avoid buffering file into memory in builtins.hashFile
Right now `builtins.hashFile` always reads the entire file into memory
before hashing, which is not ideal for large files. This replaces
`read_to_string` with `open_file` which allows calculating the hash of
the file without buffering it entirely into memory. Other callers can
continue to buffer into memory if they choose, but they still use the
`open_file` VM request and then call `read_to_string` or `read_to_end`
on the `std::io::Reader`.

Fixes b/380

Change-Id: Ifa1c8324bcee8f751604b0b449feab875c632fda
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11236
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-09 17:31:58 +00:00
Ryan Lahfa
cecb5e295a feat(tvix/eval): implement builtins.path
Now, it supports almost everything except `recursive = false;`, i.e. `flat`-ingestion
because we have no knob exposed in the tvix store import side to do it.

This has been tested to work.

Change-Id: I2e9da10ceccdfbf45b43c532077ed45d6306aa98
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10597
Tested-by: BuildkiteCI
Autosubmit: raitobezarius <tvl@lahfa.xyz>
Reviewed-by: flokli <flokli@flokli.de>
2024-04-01 12:30:26 +00:00
Ryan Lahfa
14fe65a50b refactor(tvix/store): generalize PathInfo constructors
Instead of enforcing NAR SHA256 all the time, we generalize the
`PathInfo` constructor to take a `CAHash` argument which can drive
whether we are having a flat, NAR or text scheme.

With this, it is now possible to implement flat schemes in our
evaluation builtins, e.g. `builtins.path`.

Change-Id: I15bfee0ef4f0f428bfbd2f30c57c012cdcf6a976
Signed-off-by: Ryan Lahfa <tvl@lahfa.xyz>
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11286
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-04-01 12:30:26 +00:00
Florian Klink
156a5a0fb6 refactor(tvix/glue): drop ingest_entries_sync
Make this function async, and do the block_on on the (single) callsite.

Change-Id: Ib8b0b54ab5370fe02ef95f38a45d8866868a9d60
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11285
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-28 21:17:20 +00:00
Florian Klink
bd32024047 refactor(tvix/glue): drop register_node_in_path_info_service_sync
Replace the (single) callsite with some code interacting with the tokio
runtime to block on the async version.

Change-Id: I3976496ae77b2bb8734603f303655834265e3f0a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11284
Tested-by: BuildkiteCI
Reviewed-by: Connor Brewster <cbrewster@hey.com>
2024-03-28 21:17:20 +00:00
Florian Klink
f1e6f98072 feat(tvix/glue/tvix_store_io): drop store_path_to_node_sync
Let's get rid of these sync helpers, they make this less understandable.

Change-Id: I3c7294647849db2747762722247c65e4e2947757
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11283
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
2024-03-28 21:17:20 +00:00
Aspen Smith
de727bccf9 feat(tvix/glue): Implement builtins.fetchurl
Implement the fetchurl builtin, and lay the groundwork for implementing
the fetchTarball builtin (which works very similarly, and is implemented
using almost the same code in C++ nix).

An overview of how this works:

1. First, we check if the store path that *would* result from the
   download already exists in the store - if it does, we just return
   that
2. If we need to download the URL, TvixStoreIO has an `http_client:
   reqwest::Client` field now which we use to make the request
3. As we're downloading the blob, we hash the data incrementally into a
   SHA256 hasher
4. We compare the hash against the expected hash (if any) and bail out
   if it doesn't match
5. Finally, we put the blob in the store and return the store path

Since the logic is very similar, this commit also implements a *chunk*
of `fetchTarball` (though the actual implementation will likely include
a refactor to some of the code reuse here).

The main thing that's missing here is caching of downloaded blobs when
fetchurl is called without a hash - I've opened b/381 to track the TODO
there.

Adding the `SSL_CERT_FILE` here is necessary to teach reqwest how to
load it during tests - see 1c16dee20 (feat(tvix/store): use reqwests'
rustls-native-roots feature, 2024-03-03) for  more info.

Change-Id: I83c4abbc7c0c3bfe92461917e23d6d3430fbf137
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11017
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: aspen <root@gws.fyi>
2024-03-11 02:21:54 +00:00
Florian Klink
771200df7c fix(tvix/eval): allow reading non-UTF8 files
With our values using bstr now, we're not restricted to only reading
files that contain valid UTF-8.

Update our `read_to_string` function to `read_to_end`
(named like `std::io::Read::read_to_end`), and have it return a Vec<u8>.

Change-Id: I87f0291dc855a132689576559c891d66c30ddf2b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11003
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: Pádraic Ó Mhuiris <patrick.morris.310@gmail.com>
Reviewed-by: flokli <flokli@flokli.de>
2024-02-21 13:55:41 +00:00
Peter Kolloch
fde488ec6d feat(tvix/nix-compat): Use StorePath in Output
https: //b.tvl.fyi/issues/264
Change-Id: Icb09be9643245cc68d09f01d7723af2d44d6bd1a
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11001
Autosubmit: Peter Kolloch <info@eigenvalue.net>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-02-21 11:38:03 +00:00
Peter Kolloch
c06fb01b3b feat(tvix/nix-compat): input_derivations with StorePaths
...in `Derivation`.

This is more type-safe and should consume less memory.

This also removes some allocations in the potentially hot path of output hash calculation.

https: //b.tvl.fyi/issues/264
Change-Id: I6ad7d3cb868dc9f750894d449a6065608ef06e8c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10957
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: Peter Kolloch <info@eigenvalue.net>
Reviewed-by: Peter Kolloch <info@eigenvalue.net>
2024-02-21 11:34:24 +00:00
Ryan Lahfa
7388078630 feat(tvix/eval): implement builtins.filterSource
We add a new set of builtins called `import_builtins`, which
will contain import-related builtins, such as `builtins.path` and
`builtins.filterSource`. Both can import paths into the store, with
various knobs to alter the result, e.g. filtering, renaming, expected
hashes.

We introduce `filtered_ingest` which will drive the filtered ingestion
via the Nix function via the generator machinery, and then we register
the root node to the path info service inside the store.

`builtins.filterSource` is very simple, `builtins.path` is a more
complicated model requiring the same logic albeit more sophisticated
with name customization, file ingestion method and expected SHA-256.

Change-Id: I1083f37808b35f7b37818c8ffb9543d9682b2de2
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10654
Autosubmit: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
2024-02-20 14:16:36 +00:00
Aspen Smith
0db46dacea feat(tvix/glue): Init fetcher builtins
Initialize a new empty builtins module `fetcher_builtins`, which will
contain the builtins which fetch URLs from the internet:

* fetchurl
* fetchGit
* fetchTarball
* fetchTree (maybe? this is experimental)

These builtins are all implemented in CPP nix at:
https://github.com/NixOS/nix/blob/2.20.2/src/libexpr/primops/fetchTree.cc

These builtins are added to the evaluation context using a similar
mechanism to the derivation builtins, and have been added everywhere
derivation builtins were previously being added.

Change-Id: I133b91cc9560f23028621414537f712e7bd8a825
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10974
Autosubmit: aspen <root@gws.fyi>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-02-19 16:51:10 +00:00
Florian Klink
c6605992c0 feat(tvix/glue): drive builds on IO
That's one possible abstraction to drive builds.
Whenever IO into a store path is requested, we look up the root node,
and in case we don't have it in PathInfoService, but KnownPaths gives us
a Derivation for that output path, trigger a build and await the result.

This recursively might trigger builds for parent paths if they haven't
been built yet.

Another option would be to simply expose a PathInfoService interface for
a builder too, and loop all building into IO via PathInfoService
composition - but let's start with something.

Note tvix-cli doesn't have a configurable BuildService yet, it's plugged
to the DummyBuildService, so whenever it needs to do a build, it'll fail,
but that's how it can be provoked:

```
(builtins.readFile (import <nixpkgs> {}).hello.outPath + "/bin/hello")
[…]
error[E029]: I/O error: /nix/store/cg8a576pz2yfc1wbhxm1zy4x7lrk8pix-hello-2.12.1: builds are not supported with DummyBuildService
 --> [code]:1:2
  |
1 | (builtins.readFile (import <nixpkgs> {}).hello.outPath + "/bin/hello")
  |  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```

Note how this fails, while pure output path calculation
(`(import <nixpkgs> {}).hello.outPath + "/bin/hello")`) still succeeds.

Change-Id: Id2075d8a2b18554d0dd608b4b29146a8cd411e7f
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10793
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-02-18 08:59:49 +00:00
Florian Klink
8253d91eaa fix(tvix/glue): don't emit ret as INFO
This causes a bit too much spam otherwise.

Change-Id: If3ced9ddfee7f49453711cd26469d1eb81983c71
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10953
Tested-by: BuildkiteCI
Reviewed-by: tazjin <tazjin@tvl.su>
2024-02-18 07:12:27 +00:00
Florian Klink
87bda3ae7a feat(tvix/glue): tune instrumentations in TvixStoreIO
Print store paths with their ToString implementation for brevity, and
don't log the sucessful return value of read_to_string.

Change-Id: I01b6838398acd66b8818095622f361fcca26fa77
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10854
Autosubmit: flokli <flokli@flokli.de>
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2024-02-17 08:01:04 +00:00
Aspen Smith
e3c92ac3b4 fix(tvix/eval): Replace inner NixString repr with Box<Bstr>
Storing a full BString here incurs the extra overhead of the capacity
for the inner byte-vector, which we basically never use as Nix strings
are immutable (and we don't do any mutation / sharing analysis).
Switching to a Box<BStr> cuts us from 72 bytes to 64 bytes per
string (and there are a lot of strings!)

Change-Id: I11f34c14a08fa02759f260b1c78b2a2b981714e4
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10794
Autosubmit: aspen <root@gws.fyi>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2024-02-13 16:49:53 +00:00