Commit graph

60 commits

Author SHA1 Message Date
Adam Joseph
1e80b9ea8b chore(tvix/eval): mark async functions which are called by the VM
Given Rust's current lack of support for tail calls, we cannot avoid
using `async` for builtins.  This is the only way to avoid
overflowing the CPU stack when we have arbitrarily deep
builtin/interpreted/builtin/interpreted/... "sandwiches".

There are only five `async fn` functions which are not builtins
(some come in multiple "flavors"):

- add_values
- resolve_with
- force, final_deep_force
- nix_eq, nix_cmp_eq
- coerce_to_string

These can be written iteratively rather than recursively (and in
fact nix_eq used to be written that way!).  I volunteer to rewrite
them.  If written iteratively they would no longer need to be
`async`.
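
As a rough illustration of the iterative style described above, here is a
minimal sketch of structural equality driven by an explicit work list
instead of recursion.  The `Value` type and the `iter_eq` and `nested`
helpers are hypothetical simplifications made up for this example; they are
not tvix's actual types or its nix_eq implementation.

```rust
/// Hypothetical, heavily simplified value type (not tvix's `Value`).
#[derive(Debug, Clone)]
pub enum Value {
    Int(i64),
    List(Vec<Value>),
}

/// Structural equality using an explicit work list instead of recursion,
/// so the comparison itself needs neither deep native stack nor `async`.
pub fn iter_eq(a: &Value, b: &Value) -> bool {
    // Pairs of sub-values that still have to be compared.
    let mut pending: Vec<(&Value, &Value)> = vec![(a, b)];

    while let Some((x, y)) = pending.pop() {
        match (x, y) {
            (Value::Int(m), Value::Int(n)) => {
                if m != n {
                    return false;
                }
            }
            (Value::List(xs), Value::List(ys)) => {
                if xs.len() != ys.len() {
                    return false;
                }
                // Defer the element comparisons instead of recursing.
                pending.extend(xs.iter().zip(ys.iter()));
            }
            // Different variants are never equal in this sketch.
            _ => return false,
        }
    }
    true
}

/// Wraps `leaf` in `depth` levels of single-element lists.
pub fn nested(depth: usize, leaf: i64) -> Value {
    let mut v = Value::Int(leaf);
    for _ in 0..depth {
        v = Value::List(vec![v]);
    }
    v
}
```

For example, `iter_eq(&nested(1000, 1), &nested(1000, 1))` compares two
nested lists without any recursive call in the comparison.  Note that
dropping a very deeply nested `Value` is still recursive in this sketch.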

There are two motivations for limiting our reliance on `async` to
only the situation (builtins) where we have no other choice:

1. Performance.

   We don't really have any good measurement of the performance hit
   that the Box<dyn Future>s impose on us.  Right now all of our
   large (nixpkgs-eval) tests are swamped by the cost of other
   things (e.g. fork()ing `nix-store`) so we can't really measure
   it.  Builtins tend to be expensive operations anyways
   (regexp-matching, sorting, etc) that are likely to already cost
   more than the `async` overhead.

2. Preserving the ability to switch to `musttail` calls.

   Clang/LLVM recently got `musttail` (mandatory-elimination tail
   calls).  Rust has refused to add this mainly because WASM doesn't
   support it, but WASM `tail_call` has been implemented and was
   recently moved to phase 4 (standardization).  It is very likely
   that Rust will get tail calls sometime in the next year; if it
   does, we won't need async anymore.  In the meantime, I'd like to
   avoid adding any further reliance on `async` in places where it
   wouldn't be straightforward to replace it with a tail call.

https://reviews.llvm.org/D99517

https://github.com/WebAssembly/proposals/pull/157

https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1462152908
Change-Id: Id15945d5a92bf52c16d93456e3437f91d93bdc57
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8290
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
Autosubmit: Adam Joseph <adam@westernsemico.com>
2023-03-13 21:33:58 +00:00
Adam Joseph
e7a534e0c6 refactor(tvix/eval): reduce fetch_{forced,captured}_with visibility
This commit moves fetch_forced_with and fetch_captured_with into the
scope of their only caller (resolve_with).

Change-Id: I9a8bc27228888729d591e8cb021c431b2b6468f5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8289
Autosubmit: Adam Joseph <adam@westernsemico.com>
Reviewed-by: tazjin <tazjin@tvl.su>
Tested-by: BuildkiteCI
2023-03-13 21:33:58 +00:00
Vincent Ambo
b3f8d66a6a chore(tvix/eval): prune some dependencies & features
* We no longer need backtrace-on-stack-overflow, as we no longer
  overflow the stack with the recent eval refactorings. This was weird
  voodoo anyways, introduced earlier to debug some cases where stack
  overflows occurred.

* default features of genawaiter crate are not needed, as we don't use
  their proc macros

Change-Id: I346fc5a18d7f117ee805909a8be8f535b96be76c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8263
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-03-13 20:30:59 +00:00
Vincent Ambo
94513525b9 refactor(tvix/eval): reorder bytecode operations match by frequency
This reorders the operations in the VM's main `match` statement while
evaluating bytecode according to the frequency with which these
operations appear in some nixpkgs evaluations.

I used raw data that looks like this:
https://gist.github.com/tazjin/63d0788a78eb8575b04defaad4ef610d

This has a small but noticeable impact on evaluation performance.

No operations have changed in any way, this is purely moving code
around.

Change-Id: Iaa4ef4f0577e98144e8905fec88149c41e8c315c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8262
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Reviewed-by: flokli <flokli@flokli.de>
Tested-by: BuildkiteCI
2023-03-13 20:30:59 +00:00
Vincent Ambo
cc59cbf3e2 refactor(tvix/eval): rename VM::tail_call_value -> VM::call_value
The name of this was not accurate anymore after all the recent
shuffling, as noted by amjoseph. Conceptual tail calls here only occur
for Nix bytecode calling Nix bytecode, but things like a builtin call
actually push a new native frame.

Change-Id: I1dea8c9663daf86482b8c7b5a23133254b5ca321
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8256
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-03-13 20:30:59 +00:00
Vincent Ambo
43b0416bd8 fix(tvix/eval): more closely line up path resolution with cppnix
... except now the tests fail, but at least it works

Change-Id: I05e86c173f40533ae65548585c1ddaa200ac5235
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8214
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2023-03-13 20:30:59 +00:00
Vincent Ambo
c700776733 refactor(tvix/eval): VM struct no longer needs to be public
Change-Id: I93b485ddd280cc15fcbaecf4aed5fcd22e28a8a8
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8212
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2023-03-13 20:30:59 +00:00
Vincent Ambo
1e37f8b52e feat(tvix/eval): give generators human-readable names
This adds static strings to generator frames that describe the
generator in a human-readable fashion, which are then logged in
observers.

This makes runtime traces very precise, explaining exactly what is
being requested from where.

Change-Id: I695659a6bd0b7b0bdee75bc8049651f62b150e0c
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8206
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
2023-03-13 20:30:59 +00:00
Vincent Ambo
43d04d9b98 refactor(tvix/eval): box PathBuf
This shaves another 8 bytes off Value. How did that type get so big?!
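
A minimal sketch of why boxing a large variant can shrink an enum.  The
`Inline` and `Boxed` enums and the `sizes` helper are hypothetical stand-ins
for illustration, not tvix's actual `Value`; the exact number of bytes saved
depends on the target and on the other variants of the real type.

```rust
use std::mem::size_of;
use std::path::PathBuf;

/// Hypothetical enum storing the path inline: the enum must be at least
/// as large as a whole `PathBuf`.
pub enum Inline {
    Integer(i64),
    Path(PathBuf),
}

/// Same shape, but the path lives on the heap behind one pointer.
pub enum Boxed {
    Integer(i64),
    Path(Box<PathBuf>),
}

/// Returns `(size of Inline, size of Boxed)` in bytes for this target.
pub fn sizes() -> (usize, usize) {
    (size_of::<Inline>(), size_of::<Boxed>())
}
```

The tradeoff is an extra heap allocation and pointer indirection whenever a
path value is created or read, in exchange for a smaller enum everywhere
else.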

Change-Id: I65e9b59a1636bd57e3cc4aec5fea16887070b832
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8153
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
2023-03-13 20:30:59 +00:00
Vincent Ambo
025c67bf4d refactor(tvix/eval): flatten call stack of VM using generators
Warning: This is probably the biggest refactor in tvix-eval history,
so far.

This replaces all instances of trampolines and recursion during
evaluation of the VM loop with generators. A generator is an
asynchronous function that can be suspended to yield a message (in our
case, vm::generators::GeneratorRequest) and receive a
response (vm::generators::GeneratorResponse).

The `genawaiter` crate provides an interpreter for generators that can
drive their execution and lets us move control flow between the VM and
suspended generators.

To do this, massive changes have occurred basically everywhere in the
code. At a high level:

1. The VM is now organised around a frame stack. A frame is either a
   call frame (execution of Tvix bytecode) or a generator frame (a
   running or suspended generator).

   The VM has an outer loop that pops a frame off the frame stack, and
   then enters an inner loop either driving the execution of the
   bytecode or the execution of a generator.

   Both types of frames have several branches that can result in the
   frame re-enqueuing itself, and enqueuing some other work (in the
   form of a different frame) on top of itself. The VM will eventually
   resume the frame when everything "above" it has been suspended.

   In this way, the VM's new frame stack takes over much of the work
   that was previously achieved by recursion.

2. All methods previously taking a VM have been refactored into async
   functions that instead emit/receive generator messages for
   communication with the VM.

   Notably, this includes *all* builtins.
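
The frame-stack idea can be sketched roughly as follows.  This is a toy
model under stated assumptions, not the actual VM: the generator is a
hand-written state machine rather than a `genawaiter` generator, and the
`Frame` enum, its fields, and the `run` function are hypothetical names
made up for this illustration.

```rust
/// A frame is either bytecode to execute or a suspended generator.
pub enum Frame {
    /// Stands in for executing bytecode: it just produces a constant.
    Call { constant: i64 },
    /// A suspended "generator" that requests one value at a time from
    /// the VM and sums up the responses.
    Generator {
        name: &'static str, // human-readable name, recorded in the trace
        todo: Vec<i64>,     // values it still wants evaluated
        sum: i64,           // state kept across suspensions
        awaiting: bool,     // true if it is waiting for a response
    },
}

/// Runs frames until the stack is empty.  Returns the final result and a
/// trace of generator names in the order they were (re)entered.
pub fn run(mut frames: Vec<Frame>) -> (Option<i64>, Vec<&'static str>) {
    let mut last_result: Option<i64> = None;
    let mut trace = Vec::new();

    // Outer loop: pop the top frame and drive it until it finishes or
    // suspends itself by pushing more work.
    while let Some(frame) = frames.pop() {
        match frame {
            Frame::Call { constant } => last_result = Some(constant),
            Frame::Generator { name, mut todo, mut sum, awaiting } => {
                trace.push(name);
                if awaiting {
                    // Deliver the response to the previous request.
                    sum += last_result.take().expect("response for generator");
                }
                match todo.pop() {
                    Some(next) => {
                        // Re-enqueue this frame, then put the requested
                        // work on top of it so that it runs first.
                        frames.push(Frame::Generator { name, todo, sum, awaiting: true });
                        frames.push(Frame::Call { constant: next });
                    }
                    // Nothing left to request: the generator is done.
                    None => last_result = Some(sum),
                }
            }
        }
    }
    (last_result, trace)
}
```

Running a single generator frame named "sum" with `todo` set to
`vec![1, 2, 3]` enters the generator four times (once initially and once per
response) and leaves 6 as the final result, without the native stack growing
with the number of requests.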

This has had some other effects:

- Some tests have been removed or commented out, either because they
  tested code that was mostly already dead (nix_eq) or because they
  now require generator scaffolding which we do not have in place for
  tests (yet).

- Because generator functions are technically async (though no async
  IO is involved), we lose the ability to use much of the Rust
  standard library e.g. in builtins. This has led to many algorithms
  being unrolled into iterative versions instead of iterator
  combinations, and things like sorting had to be implemented from scratch.

- Many call sites that previously saw a `Result<..., ErrorKind>`
  bubble up now only see the result value, as the error handling is
  encapsulated within the generator loop.

  This reduces the number of places inside of builtin implementations
  where error context can be attached to calls that can fail.
  Currently what we gain in this tradeoff is significantly more
  detailed span information (which we still need to bubble up, this
  commit does not change the error display).

  We'll need to do some analysis later of how useful the errors turn
  out to be and potentially introduce some methods for attaching
  context to a generator frame again.

This change is very difficult to do in stages, as it is very much an
"all or nothing" change that affects huge parts of the codebase. I've
tried to isolate changes that can be isolated into the parent CLs of
this one, but this change is still quite difficult to wrap one's mind
around, and I'm available to discuss it and explain things to any reviewer.

Fixes: b/238, b/237, b/251 and potentially others.
Change-Id: I39244163ff5bbecd169fe7b274df19262b515699
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8104
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Reviewed-by: Adam Joseph <adam@westernsemico.com>
Tested-by: BuildkiteCI
2023-03-13 20:30:59 +00:00