ClickHouse also has column compression, configurable with the
output_format_parquet_compression_method setting.
It defaults to lz4, so the previous setting produced a zstd-compressed
parquet file with lz4-compressed data.
Set output_format_parquet_compression_method to zstd instead, and sort
by timestamp before assembling the parquet file.
The existing files were updated to the same format with the following query:
```
SELECT * FROM file('bucket_logs_2023-11-11*.pq', 'Parquet', 'auto') ORDER BY timestamp ASC INTO OUTFILE 'bucket_logs_2023-11-11.parquet' SETTINGS output_format_parquet_compression_method = 'zstd'
```
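To check which codec ended up in a rewritten file, a query along these lines might help. This is only a sketch: it assumes a ClickHouse version that ships the ParquetMetadata input format, and that its `columns` field is an array of named tuples with a `compression` element, as the upstream docs describe. The `codecs` alias is made up for illustration.
```
-- Assumption: the ParquetMetadata format is available in this ClickHouse build.
-- Collects the distinct compression codecs reported for all column chunks.
SELECT num_rows, total_compressed_size, arrayDistinct(arrayMap(c -> tupleElement(c, 'compression'), columns)) AS codecs FROM file('bucket_logs_2023-11-11.parquet', 'ParquetMetadata')
```
If the setting took effect, codecs would be expected to list only a zstd entry and no lz4 one.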
Change-Id: Id63b14c82e7bf4b9907a500528b569a51e277751
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10008
Reviewed-by: raitobezarius <tvl@lahfa.xyz>
Tested-by: BuildkiteCI
| Name |
|---|
| default.nix |
| OWNERS |
| parse_bucket_logs.rs |
| README.md |
archeology
This directory contains various scripts and helpers used for nix-archeology tasks.
It's used from some of the archeology instances, as well as standalone.