DEVONthink has been my weapon of choice for years when it comes to organizing information on my computer. Since I recently switched to Linux, I need an alternative, so I decided to build one myself. Here I record my considerations and learnings.
What to Build
I decided to create a CLI semantic search tool that can also be used from lf to rank a variety of text files in ascending or descending order of relevance.
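The ranking half of that idea is simple to sketch in plain Rust: given a query embedding and precomputed file embeddings, sort files by cosine similarity. This is a minimal sketch with hypothetical names (`cosine_similarity`, `rank_descending`); how the embeddings are produced comes later.

```rust
// Cosine similarity between two embedding vectors; higher means more similar.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

// Sort (path, embedding) pairs by similarity to the query, most similar first.
fn rank_descending(query: &[f32], files: &mut Vec<(String, Vec<f32>)>) {
    files.sort_by(|a, b| {
        cosine_similarity(query, &b.1)
            .partial_cmp(&cosine_similarity(query, &a.1))
            .unwrap()
    });
}
```

Reversing the comparison (or the resulting list) gives the ascending order lf would need to put the best match at the bottom.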
Progress
15-5-2025: Decided on a model and embedding engine
I opted for all-MiniLM-L6-v2 because it has some neat characteristics that make it a promising choice for a solution that’ll be running on a laptop without a dedicated GPU:
| Feature | all-MiniLM-L6-v2 |
|---|---|
| Model size | ~90 MB (very small) |
| Hidden size | 384 (lightweight) |
| Layers | 6 (vs. 12+ in full BERT) |
| Inference latency (CPU) | ~10–40 ms per input (text-length dependent) |
| RAM usage | Low (< 300 MB total) |
ChatGPT even suggests it was created for desktop CLI tools, embedded systems, batch processing tools, and local-first apps with privacy focus.
As for the embedding engine, I’ll be using ONNX (Open Neural Network Exchange). ONNX might be more complicated to set up than a pure Python implementation, but the shorter inference times on CPU-only systems will hopefully be worth it.
30-5-2025: Ran a full embedding pipeline using the ONNX model and a HuggingFace tokenizer in Rust
- Loaded the all-MiniLM-L6-v2 tokenizer:

```rust
use tokenizers::Tokenizer;

// tokenizer.json ships alongside the model on the Hugging Face hub.
let tokenizer = Tokenizer::from_file("models/minilm/tokenizer.json")?;
```
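After the ONNX model produces per-token embeddings, all-MiniLM-L6-v2 sentence embeddings are conventionally obtained by mean pooling over the real (non-padding) tokens. A minimal stdlib-only sketch, assuming the model output has already been copied into a `Vec<Vec<f32>>` of shape `[seq_len][hidden]` alongside the tokenizer's attention mask:

```rust
// Mean-pool token embeddings into one sentence embedding.
// `token_embeddings`: one hidden-size vector per token (hidden = 384 here).
// `attention_mask`: 1 for real tokens, 0 for padding; padded tokens are skipped.
fn mean_pool(token_embeddings: &[Vec<f32>], attention_mask: &[u32]) -> Vec<f32> {
    let hidden = token_embeddings[0].len();
    let mut sum = vec![0.0f32; hidden];
    let mut count = 0.0f32;
    for (emb, &mask) in token_embeddings.iter().zip(attention_mask) {
        if mask == 1 {
            for (s, v) in sum.iter_mut().zip(emb) {
                *s += v;
            }
            count += 1.0;
        }
    }
    // Guard against an all-zero mask to avoid dividing by zero.
    for s in sum.iter_mut() {
        *s /= count.max(1.0);
    }
    sum
}
```

In the actual pipeline this pooled vector is what gets compared across files; normalizing it first makes cosine similarity a plain dot product.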