refactor(tvix/store/blobsvc): make BlobStore async

We previously kept the BlobService trait sync.

This however had some annoying consequences:

 - It became increasingly complicated to track whether we were running
   inside an async runtime or not, producing bugs like
   https://b.tvl.fyi/issues/304
 - The sync trait shielded async clients from async workloads,
   requiring manual block_on code inside the gRPC client code, and
   spawn_blocking calls in consumers of the trait, even if they were
   async (like the gRPC server)
 - We had to write our own custom glue code (SyncReadIntoAsyncRead)
   to convert a sync io::Read into a tokio::io::AsyncRead, which already
   exists in tokio internally, but which upstream is hesitant to expose.

This change makes the BlobService trait async (via the async_trait macro,
like we already do in various gRPC parts), and replaces the sync readers
and writers with their async counterparts.

Tests interacting with a BlobService now need to have an async runtime
available. The easiest way to get one is to mark the test functions
with the tokio::test macro, which allows us to .await directly in the
test function.

In places where we don't have an async runtime available from context
(like tvix-cli), we can pass one down explicitly.

Now that we don't provide a sync interface anymore, the (sync) FUSE
library holds a pointer to a tokio runtime handle, and needs at least
2 threads available when talking to a blob service (which is why some
of the tests now use the multi_thread flavor).

The FUSE tests got a bit more verbose, as we couldn't use the
setup_and_mount function accepting a callback anymore. We can hopefully
move some of the test fixture setup to rstest in the future to make this
less repetitive.

Co-Authored-By: Connor Brewster <cbrewster@hey.com>
Change-Id: Ia0501b606e32c852d0108de9c9016b21c94a3c05
Reviewed-on: https://cl.tvl.fyi/c/depot/+/9329
Reviewed-by: Connor Brewster <cbrewster@hey.com>
Tested-by: BuildkiteCI
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Florian Klink 2023-09-13 14:20:21 +02:00 committed by flokli
parent 3de9601764
commit da6cbb4a45
25 changed files with 1700 additions and 1002 deletions


@@ -6,8 +6,6 @@ use std::sync::Arc;
 use std::{
     collections::HashMap,
     fmt::Debug,
-    fs::File,
-    io,
     os::unix::prelude::PermissionsExt,
     path::{Path, PathBuf},
 };
@@ -57,7 +55,7 @@ impl From<super::Error> for Error {
 //
 // It assumes the caller adds returned nodes to the directories it assembles.
 #[instrument(skip_all, fields(entry.file_type=?&entry.file_type(),entry.path=?entry.path()))]
-fn process_entry(
+async fn process_entry(
     blob_service: Arc<dyn BlobService>,
     directory_putter: &mut Box<dyn DirectoryPutter>,
     entry: &walkdir::DirEntry,
@@ -102,16 +100,17 @@ fn process_entry(
     .metadata()
     .map_err(|e| Error::UnableToStat(entry.path().to_path_buf(), e.into()))?;
-let mut file = File::open(entry.path())
+let mut file = tokio::fs::File::open(entry.path())
+    .await
     .map_err(|e| Error::UnableToOpen(entry.path().to_path_buf(), e))?;
-let mut writer = blob_service.open_write();
+let mut writer = blob_service.open_write().await;
-if let Err(e) = io::copy(&mut file, &mut writer) {
+if let Err(e) = tokio::io::copy(&mut file, &mut writer).await {
     return Err(Error::UnableToRead(entry.path().to_path_buf(), e));
 };
-let digest = writer.close()?;
+let digest = writer.close().await?;
 return Ok(proto::node::Node::File(proto::FileNode {
     name: entry.file_name().as_bytes().to_vec().into(),
@@ -137,7 +136,7 @@ fn process_entry(
 /// caller to possibly register it somewhere (and potentially rename it based on
 /// some naming scheme.
 #[instrument(skip(blob_service, directory_service), fields(path=?p))]
-pub fn ingest_path<P: AsRef<Path> + Debug>(
+pub async fn ingest_path<P: AsRef<Path> + Debug>(
     blob_service: Arc<dyn BlobService>,
     directory_service: Arc<dyn DirectoryService>,
     p: P,
@@ -175,7 +174,8 @@ pub fn ingest_path<P: AsRef<Path> + Debug>(
     &mut directory_putter,
     &entry,
     maybe_directory,
-)?;
+)
+.await?;
 if entry.depth() == 0 {
     return Ok(node);