docs(tvix): initial notes on a possible generic Nix lang test suite
This collects points to consider, which should hopefully help in figuring out what such a lang test suite could or should look like exactly, which is something I currently struggle with somewhat.

Change-Id: If4f47546fe4b8046fb79718743fa9a72f9801876
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10657
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
Reviewed-by: flokli <flokli@flokli.de>
Autosubmit: sterni <sternenseemann@systemli.org>
parent 68bba48d59
commit bdeb6f406e
1 changed file with 140 additions and 0 deletions
tvix/nix-lang-test-suite/README.md (Normal file, 140 additions)
									
							|  | @ -0,0 +1,140 @@ | |||
| # The Implementation Independent Nix Language Test Suite | ||||
| 
 | ||||
| ## Design Notes | ||||
| 
 | ||||
| ### Requirements | ||||
| 
 | ||||
| - It should work with potentially any Nix implementation and with all serious | ||||
|   currently available ones (C++ Nix, hnix, Tvix, …). How much of it the | ||||
|   implementations pass is, of course, an orthogonal question. | ||||
| - It should be easy to add test cases, independent of any specific | ||||
|   implementation. | ||||
| - It should be simple to ignore test cases and mark known failures | ||||
|   (similar to the notyetpassing mechanism in the Tvix test suite). | ||||
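The last requirement could be met by keeping two plain lists next to the test cases, one for ignored tests and one for known failures. The following is only a sketch of how a runner might classify results under that assumption; `Outcome`, `classify` and the two sets are hypothetical names and not part of any existing test suite.

```python
from enum import Enum
from typing import AbstractSet


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    KNOWN_FAILURE = "known failure"
    UNEXPECTED_PASS = "unexpected pass"
    SKIPPED = "skipped"


def classify(
    name: str,
    passed: bool,
    known_failures: AbstractSet[str] = frozenset(),
    ignored: AbstractSet[str] = frozenset(),
) -> Outcome:
    """Combine a raw pass/fail result with the per-implementation lists."""
    if name in ignored:
        # Ignored tests are reported, but never count as failures.
        return Outcome.SKIPPED
    if name in known_failures:
        # A known failure that starts passing should be noticed so that
        # the list can be cleaned up.
        return Outcome.UNEXPECTED_PASS if passed else Outcome.KNOWN_FAILURE
    return Outcome.PASS if passed else Outcome.FAIL
```

Reporting unexpected passes separately keeps the known failure list from going stale, which is the main design choice in this sketch.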
| 
 | ||||
| ### Test Case Types | ||||
| 
 | ||||
| This is a summary of relevant kinds of test cases that can be found in the wild, | ||||
| usually testing some kind of concrete implementation, but also doubling up as a | ||||
| potential test case for _any_ Nix implementation. For the most part, this is the | ||||
| `lang` test suite of C++ Nix, which is also used by Tvix and hnix. | ||||
| 
 | ||||
| - **parse** test cases: Parsing the given expression should either *succeed* or | ||||
|   *fail*. | ||||
| 
 | ||||
|   - C++ Nix doesn't have any expected output for the success cases while | ||||
|     `rnix-parser` checks them against its own textual AST representation. | ||||
|   - For the failure cases, `rnix-parser` and C++ Nix (as of recently) have | ||||
|     expected error messages/representations. | ||||
| 
 | ||||
|   Both success and failure cases are probably hard to check against expected | ||||
|   output/error messages in a generic test suite. Even if standardized error | ||||
|   codes are implemented (see below), it is doubtful whether it'd be useful | ||||
|   to have a dedicated code for every kind of parse/lex failure. | ||||
| - (strict) **eval** test cases: Evaluating the given expression should either | ||||
|   *fail* or *succeed* and yield a given result. | ||||
| 
 | ||||
|   - **eval-okay** (success) tests currently require three things: | ||||
| 
 | ||||
|     1. Successful evaluation after deeply forcing and printing the evaluation | ||||
|        result (i.e. `nix-instantiate --eval --strict`) | ||||
|     2. That the output matches an expected output exactly (string equality). | ||||
|        For this the output of `nix-instantiate(1)` is used, sometimes with | ||||
|        the addition of the `--xml --no-location` or `--json` flags. | ||||
|     3. Optionally, stderr may need to be equal to an expected string exactly | ||||
|        which would test e.g. `builtins.trace` messages or deprecation warnings | ||||
|        (C++ Nix). | ||||
| 
 | ||||
|        This extra check is currently not supported by the Tvix test suite. | ||||
| 
 | ||||
|   - **eval-fail** tests require that the given expression fails to evaluate. C++ | ||||
|     Nix has recently started to also check the error messages via the stderr | ||||
|     mechanism described above. This is not supported by Tvix at the moment. | ||||
| - _lazy_ eval test cases: This is currently only supported by the `nix_oracle` | ||||
|   test suite in Tvix which compares the evaluation result of expressions to the | ||||
|   output of `nix-instantiate(1)` without `--strict`. By relying on the fact | ||||
|   that the resulting value is not forced deeply before printing, it can be | ||||
|   observed whether certain expressions are thunked or not. | ||||
| 
 | ||||
|   This is somewhat fragile as permissible optimizations may prevent a thunk from | ||||
|   being created. However, this should not be an issue if the cases are chosen | ||||
|   carefully. Empirically, this test suite was useful for catching some instances | ||||
|   of overzealous evaluation early in development of Tvix. | ||||
| 
 | ||||
| - **identity** test cases require that the given expression evaluates to a | ||||
|   value whose printed representation is the same as (string equal to) the original | ||||
|   expression. Such test cases only exist in the Tvix test suite. | ||||
| 
 | ||||
|   Of course, only a limited number of expressions satisfy this, but it is | ||||
|   useful for testing `nix-instantiate(1)` style value printing. Consequently, | ||||
|   it is kind of on the edge of what you can call a language test. | ||||
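The parse and strict eval cases above can be sketched as a small checking function. This assumes that the kind of a test case is encoded in its name (`parse-okay-…`, `parse-fail-…`, `eval-okay-…`, `eval-fail-…`), as in the C++ Nix `lang` suite; `RunResult`, `expectation` and `check` are hypothetical names, and how the implementation is actually invoked is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RunResult:
    succeeded: bool  # did the implementation exit successfully?
    stdout: str = ""


def expectation(test_name: str) -> Tuple[str, bool]:
    """Split a name like 'eval-okay-arithmetic' into (kind, should_succeed)."""
    kind, verdict, *_ = test_name.split("-")
    if kind not in ("parse", "eval") or verdict not in ("okay", "fail"):
        raise ValueError(f"unrecognised test name: {test_name}")
    return kind, verdict == "okay"


def check(
    test_name: str, result: RunResult, expected_stdout: Optional[str] = None
) -> bool:
    kind, should_succeed = expectation(test_name)
    if result.succeeded != should_succeed:
        return False
    # Only successful evaluations are compared against expected output,
    # using exact string equality. Parse cases only check success/failure.
    if kind == "eval" and should_succeed and expected_stdout is not None:
        return result.stdout == expected_stdout
    return True
```

Lazy eval and identity cases would fit the same shape: the former differ only in how the implementation is invoked, the latter use the source expression itself as the expected output.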
| 
 | ||||
| ### Extra Dependencies of Some Test Cases | ||||
| 
 | ||||
| - **Filesystem**: Some test cases `import` other files or use `builtins.readFile`, | ||||
|   `builtins.readDir` and friends. | ||||
| - **Working and Home Directory**: Tests involving relative and home-relative paths | ||||
|   need knowledge of the current and home directory to correctly interpret the output. | ||||
|   C++ Nix does a [search and replace on the test output for this purpose][cpp-nix-pwd-sed]. | ||||
| - **Nix Store**: Some tests add files to the store, either via path interpolation, | ||||
|   `builtins.toFile` or `builtins.derivation`. | ||||
| 
 | ||||
|   Additionally, it should be considered that Import-from-Derivation may be | ||||
|   interesting to test in the future. Currently, the Tvix and C++ Nix test | ||||
|   suites all pass with Import-from-Derivation disabled, i.e. a dummy store | ||||
|   implementation is enough. | ||||
| 
 | ||||
|   Note that the absence of a store dependency ideally also influences the test | ||||
|   execution: In Tvix, for example, store independent tests can be executed | ||||
|   with a store backend that immediately errors out, verifying that the test | ||||
|   is, in fact, store independent. | ||||
| - **Environment**: The C++ Nix test suite sets a single environment variable, | ||||
|   `TEST_VAR=foo`. Additionally, `NIX_PATH` and `HOME` are sometimes set (the | ||||
|   latter is probably not a great idea, since it is not terribly reliable). | ||||
| - **Nix Path**: A predetermined Nix Path (via `NIX_PATH` and/or command line | ||||
|   arguments) needs to be set for some test cases. | ||||
| - **Nix flags**: Some tests need to have extra flags passed to `nix-instantiate(1)` | ||||
|   in order to work. This is done using a `.flags` file. | ||||
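Several of these dependencies amount to preparing the invocation and post-processing the output. The sketch below shows one way a runner might do that. It assumes the `.flags` file sits next to the test case and shares its base name; the placeholders `/pwd` and `/home` are arbitrary choices for illustration, and `build_command`, `build_env` and `normalize` are hypothetical helpers.

```python
import shlex
from pathlib import Path
from typing import Dict, List


def build_command(test_file: Path, base_cmd: List[str]) -> List[str]:
    """Append flags from an optional sibling `.flags` file to the command."""
    flags_file = test_file.with_suffix(".flags")
    extra = shlex.split(flags_file.read_text()) if flags_file.exists() else []
    return [*base_cmd, *extra, str(test_file)]


def build_env(nix_path: str) -> Dict[str, str]:
    """A fixed, minimal environment so tests don't depend on the caller's."""
    return {"TEST_VAR": "foo", "NIX_PATH": nix_path}


def normalize(output: str, cwd: str, home: str) -> str:
    """Replace machine specific directories with stable placeholders."""
    replacements = [(cwd, "/pwd"), (home, "/home")]
    # Replace the longer path first, in case one directory contains the other.
    for path, placeholder in sorted(replacements, key=lambda r: -len(r[0])):
        output = output.replace(path, placeholder)
    return output
```

For example, `build_command(Path("eval-okay-x.nix"), ["nix-instantiate", "--eval", "--strict"])` would pick up `eval-okay-x.flags` if it exists. The expected output files would then be written in terms of the placeholders rather than real paths.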
| 
 | ||||
| ### Expected Output Considerations | ||||
| 
 | ||||
| #### Success | ||||
| 
 | ||||
| The expected output of `eval-okay` test cases (which are the majority of test | ||||
| cases) uses the standard strict output of `nix-instantiate(1)` in most cases | ||||
| which is nice to read and easy to work with. However, some more obscure aspects | ||||
| of this output inevitably leak into the test cases, namely the cycle detection | ||||
| and printing and (in the case of Tvix) the printing of thunks. Unfortunately, | ||||
| the output has been changed after Nix 2.3, bringing it closer to the output of | ||||
| `nix eval`, but in an inconsistent manner (e.g. `<CYCLE>` was changed to | ||||
| `«repeated»`, but `<LAMBDA>` remained). As a consequence, it is not always | ||||
| possible to write C++ Nix version independent test cases. | ||||
| 
 | ||||
| It is unclear whether a satisfying solution (for a common test suite) can | ||||
| be achieved here as it has become a somewhat contentious [issue whether | ||||
| or not nix-instantiate should have a stable output][cpp-nix-attr-elision-printing-pr]. | ||||
| 
 | ||||
| A solution may be to use the XML output, specifically the `--xml --no-location` | ||||
| flags to `nix-instantiate(1)` for some of these instances. As it (hopefully) | ||||
| corresponds to `builtins.toXML`, there should be a greater incentive to keep it | ||||
| stable. It does support (only via `nix-instantiate(1)`, though) printing | ||||
| unevaluated thunks, but has no kind of cycle detection (which is fair enough for | ||||
| its intended purpose). | ||||
| 
 | ||||
| #### Failure | ||||
| 
 | ||||
| C++ Nix has recently (some time after Nix 2.3, probably much later actually) | ||||
| started checking error messages via expected stderr output. This naturally | ||||
| won't work for an implementation-independent language test suite: | ||||
| 
 | ||||
| - It is fine to have differing phrasing for error messages or localize them. | ||||
| - Printed error positions and stack traces may be slightly different depending | ||||
|   on implementation internals. | ||||
| - Formatting will almost certainly differ. | ||||
| 
 | ||||
| Consequently, just checking for failure when running the test suite should be | ||||
| an option. Long term, it may be interesting to have standardized error codes | ||||
| and portable error code reporting. | ||||
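Making the message comparison opt-in could look roughly like the following. This is a sketch under the assumption that the runner knows whether the implementation exited successfully and has captured its stderr; `check_failure` and its parameters are hypothetical.

```python
from typing import Optional


def check_failure(
    succeeded: bool,
    stderr: str,
    expected_stderr: Optional[str] = None,
    compare_messages: bool = False,
) -> bool:
    """Check a failure test case, comparing error messages only on request."""
    if succeeded:
        # The expression was supposed to fail.
        return False
    if compare_messages and expected_stderr is not None:
        # Only meaningful for the implementation the expected messages
        # were recorded with.
        return stderr == expected_stderr
    # Default: any failure is accepted, regardless of wording or formatting.
    return True
```

With `compare_messages` off, an implementation passes as long as it rejects the expression, so phrasing, positions and formatting are free to differ.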
| 
 | ||||
| [cpp-nix-pwd-sed]: https://github.com/NixOS/nix/blob/2cb9c7c68102193e7d34fabe6102474fc7f98010/tests/functional/lang.sh#L109 | ||||
| [cpp-nix-attr-elision-printing-pr]: https://github.com/NixOS/nix/pull/9606 | ||||