Simple Select
- Simple Select is a less expressive but more ergonomic query language for tabular data than SQL.
- slx is a command-line tool for querying CSVs using the Simple Select query language.
Simple Select queries look like this: director:"Tarantino" OR director:"Scorsese".
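To make the query shape concrete, here is a minimal tokenizer sketch in Python. It is not the code in scanner.py. The token names, the `tokenize` helper, and the grammar (a `field:"string"` or `field:/regex/` term joined by `AND`/`OR`) are assumptions inferred from the examples in this README.

```python
import re

# Hypothetical grammar inferred from the README examples:
#   term := field:"string" | field:/regex/
#   op   := AND | OR
TOKEN_RE = re.compile(r'''
    \s*(?:
        (?P<op>AND|OR)\b
      | (?P<field>\w+):(?:"(?P<string>[^"]*)"|/(?P<regex>[^/]*)/)
    )''', re.VERBOSE)


def tokenize(query):
    """Split a query into ("OP", op), ("STRING", field, value) or ("REGEX", field, pattern)."""
    tokens = []
    pos = 0
    query = query.strip()
    while pos < len(query):
        m = TOKEN_RE.match(query, pos)
        if m is None:
            raise ValueError(f"unexpected input at position {pos}: {query[pos:]!r}")
        if m.group("op"):
            tokens.append(("OP", m.group("op")))
        elif m.group("string") is not None:
            tokens.append(("STRING", m.group("field"), m.group("string")))
        else:
            tokens.append(("REGEX", m.group("field"), m.group("regex")))
        pos = m.end()
    return tokens


print(tokenize('director:"Tarantino" OR director:"Scorsese"'))
```

This sketch does not handle escaped quotes or slashes inside a term. A real scanner would need to.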
Example
Say we have the following data in a CSV:
title,year,rating,director
"Spirited Away",2001,8.5,"Hayao Miyazaki"
Andhadhun,2018,8.1,"Sriram Raghavan"
Dangal,2016,8.3,"Sriram Raghavan"
"Avengers: Infinity War",2019,8.4,"Anthony Russo"
Alien,1979,8.4,"Ridley Scott"
...
We can invoke slx like so...
$ slx -f /tmp/movies.csv
...and then query using the REPL:
> director:/S.*m/ OR director:"Hayao"
Andhadhun       2018    8.1     1       Sriram Raghavan 0       1
Dangal  2016    8.3     1       Sriram Raghavan 0       1
Howls Moving Castle     2004    8.2     0       Hayao Miyazaki  1       1
Judgment at Nuremberg   1961    8.1     0       Stanley Kramer  0       0
Laputa: Castle in the Sky       1986    8.0     0       Hayao Miyazaki  1       1
Nausicaa of the Valley of the Wind      1984    8.0     0       Hayao Miyazaki  1       1
Network 1976    8.1     0       Sidney Lumet    0       0
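For readers who want to see what the query above does to the data, the following Python sketch hand-codes the same filter over the five rows shown in the CSV excerpt. It is an illustration, not how slx is implemented. The helper names are hypothetical, and it assumes that a quoted string matches as a substring and that a regex matches anywhere in the field, which is consistent with the output above.

```python
import csv
import io
import re

# Rows copied from the CSV excerpt in this README (the "..." rows are not included).
MOVIES_CSV = '''title,year,rating,director
"Spirited Away",2001,8.5,"Hayao Miyazaki"
Andhadhun,2018,8.1,"Sriram Raghavan"
Dangal,2016,8.3,"Sriram Raghavan"
"Avengers: Infinity War",2019,8.4,"Anthony Russo"
Alien,1979,8.4,"Ridley Scott"
'''


def matches_string(row, field, needle):
    # Assumption: a quoted string matches when it is a substring of the field.
    return needle in row[field]


def matches_regex(row, field, pattern):
    # Assumption: a regex may match anywhere in the field.
    return re.search(pattern, row[field]) is not None


def run_query(rows):
    # Hand-written equivalent of: director:/S.*m/ OR director:"Hayao"
    return [
        row for row in rows
        if matches_regex(row, "director", r"S.*m")
        or matches_string(row, "director", "Hayao")
    ]


rows = list(csv.DictReader(io.StringIO(MOVIES_CSV)))
for row in run_query(rows):
    # Tab-separated output, mirroring the REPL output format.
    print("\t".join(row.values()))
```

On these five rows the filter keeps Spirited Away, Andhadhun, and Dangal.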
Warning
Simple Select is not intended for production use. I wrote this as a toy project for my own consumption. There are quite a few bugs that I'm aware of, and quite a few features that I'd like to support but haven't had time to implement yet.
Why publish it then? Maybe this project will inspire drive-by contributions or other, better-implemented spin-offs.
Wish List
Speaking of drive-by contributions, here are some things that I'd like to support:
- Implicit AND conjunctions (director:/Tarantino/ year:"2000" instead of director:/Tarantino/ AND year:"2000")
- Support for types like numbers, dates (year:2000 instead of year:"2000")
- slx should support CSV and (at the very least) sqlite3 file formats (open to other formats as well)
- Regexes should be the default query primitive (director:Tarantino instead of director:/Tarantino/)
- Improve parsing errors (including surfacing errors to the user)
- Support for reading from STDIN and issuing queries from the command-line
- Unit-testing
- Configurable delimiters for output data (right now it's just \t)
- (Maybe) rewrite in a faster, more type-safe language (e.g. Rust)
I'm likely missing other feature requests and bugs, so please file issues!