# localsearch

A fast, local search engine built in Rust with vector embeddings and SQLite storage.

## Features
- 🔍 Full-Text + Semantic Search using embeddings generated and stored locally
- 📁 Local file indexing and search
- 🗄️ SQLite-based storage
- 📚 Both library and CLI interfaces
## Installation

### Quick Install

**Linux/macOS:**

```bash
curl -sSL https://raw.githubusercontent.com/nnanto/localsearch/main/scripts/install.sh | bash
```

**Windows (PowerShell):**

```powershell
irm https://raw.githubusercontent.com/nnanto/localsearch/main/scripts/install.ps1 | iex
```

### Manual Installation

Download the appropriate binary for your platform from the latest release:

**Linux (x86_64):**

```bash
curl -L https://github.com/nnanto/localsearch/releases/latest/download/localsearch-linux-x86_64.tar.gz | tar xz
sudo mv localsearch /usr/local/bin/
```

**macOS (Intel):**

```bash
curl -L https://github.com/nnanto/localsearch/releases/latest/download/localsearch-macos-x86_64.tar.gz | tar xz
sudo mv localsearch /usr/local/bin/
```

**macOS (Apple Silicon):**

```bash
curl -L https://github.com/nnanto/localsearch/releases/latest/download/localsearch-macos-aarch64.tar.gz | tar xz
sudo mv localsearch /usr/local/bin/
```

**Windows:**

- Download `localsearch-windows-x86_64.zip`
- Extract the ZIP file
- Add the extracted directory to your PATH environment variable
### Build from Source

If you have Rust installed, you can build from source:

```bash
cargo install --git https://github.com/nnanto/localsearch --features cli
```

Or clone and build:

```bash
git clone https://github.com/nnanto/localsearch.git
cd localsearch
cargo build --release --features cli
sudo cp target/release/localsearch /usr/local/bin/
```

### Verify Installation

After installation, verify that the tool is working:

```bash
localsearch --help
```

You should see the help output for the localsearch CLI tool.
### Updating

To update to the latest version, simply re-run the installation command. The installer will replace the existing binary with the latest version.
### Uninstalling

**Linux/macOS:**

```bash
sudo rm /usr/local/bin/localsearch
```

**Windows:** Remove the installation directory and update your PATH environment variable to remove the localsearch directory.
### Troubleshooting

- **Permission errors (Linux/macOS):** make sure you're running the installation with appropriate permissions (use `sudo` when needed).
- **`localsearch` command not found:** make sure the installation directory is in your PATH:
  - Linux/macOS: `/usr/local/bin` should be in your PATH
  - Windows: the installation directory should be added to your PATH environment variable
- **Antivirus warnings:** some antivirus software may flag the binary as suspicious. This is a common issue with Rust binaries; you may need to add an exception for the localsearch binary.
## Quick Start

```bash
# Index documents (uses system default directories)
localsearch index /path/to/documents

# Search for content
localsearch search "your query here"
```

### Default Directories

By default, localsearch uses system-appropriate directories:

- **Cache:** model files are stored in the system cache directory (e.g., `~/.cache` on Linux, `~/Library/Caches` on macOS)
- **Database:** the SQLite database is stored in the application data directory (e.g., `~/.local/share` on Linux, `~/Library/Application Support` on macOS)
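The per-OS defaults above follow common platform conventions. A minimal stand-alone sketch of how such defaults can be resolved on Linux and macOS (illustrative only — the actual `LocalSearchDirs` implementation may differ):

```rust
use std::env;
use std::path::PathBuf;

// Resolve a per-OS cache directory, Linux/macOS only (illustrative sketch).
fn default_cache_dir() -> PathBuf {
    let home = PathBuf::from(env::var("HOME").unwrap_or_else(|_| ".".into()));
    if cfg!(target_os = "macos") {
        home.join("Library/Caches/localsearch")
    } else {
        // Respect $XDG_CACHE_HOME on Linux, falling back to ~/.cache
        env::var("XDG_CACHE_HOME")
            .map(PathBuf::from)
            .unwrap_or_else(|_| home.join(".cache"))
            .join("localsearch")
    }
}

// Resolve a per-OS application data directory the same way.
fn default_data_dir() -> PathBuf {
    let home = PathBuf::from(env::var("HOME").unwrap_or_else(|_| ".".into()));
    if cfg!(target_os = "macos") {
        home.join("Library/Application Support/localsearch")
    } else {
        env::var("XDG_DATA_HOME")
            .map(PathBuf::from)
            .unwrap_or_else(|_| home.join(".local/share"))
            .join("localsearch")
    }
}

fn main() {
    println!("cache: {}", default_cache_dir().display());
    println!("data:  {}", default_data_dir().display());
}
```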
You can override these defaults:

```bash
# Use custom database location
localsearch index /path/to/documents --db /custom/path/to/database.db

# Use custom cache directory for embeddings
localsearch index /path/to/documents --cache-dir /custom/cache/path

# Use both custom paths
localsearch index /path/to/documents --db /custom/db.db --cache-dir /custom/cache

# Search with custom paths
localsearch search "query" --db /custom/db.db --cache-dir /custom/cache
```

### Local ONNX Models (CLI)

You can use your own local ONNX embedding models instead of the default pre-built models:
```bash
# Index with local model
localsearch index /path/to/documents \
  --local-model-path /path/to/your/model.onnx \
  --tokenizer-dir /path/to/tokenizer/directory \
  --max-tokens 512

# Search with local model
localsearch search "your query" \
  --local-model-path /path/to/your/model.onnx \
  --tokenizer-dir /path/to/tokenizer/directory \
  --max-tokens 512
```

Required for local models:

- `--local-model-path`: path to your ONNX model file
- `--tokenizer-dir`: directory containing `tokenizer.json`, `config.json`, `special_tokens_map.json`, and `tokenizer_config.json`
- `--max-tokens`: (optional) maximum number of tokens (default: 512)
### File Types

```bash
# Index JSON files (default)
localsearch index data.json --file-type json

# Index text files
localsearch index /path/to/text/files --file-type text
```

### Search Options

```bash
# Different search types
localsearch search "query" --search-type semantic
localsearch search "query" --search-type fulltext
localsearch search "query" --search-type hybrid   # default

# Limit results
localsearch search "query" --limit 5

# Pretty output
localsearch search "query" --pretty
```

### Path Filtering

Filter search results to only include documents whose paths contain specific patterns:
```bash
# Search only in documents with "src" in the path
localsearch search "function" --path-filter "src"

# Search in multiple path patterns (OR logic)
localsearch search "test" --path-filter "src,test,doc"

# Mix different types of patterns
localsearch search "config" --path-filter "settings,config.json,main"

# Case-insensitive substring matching
localsearch search "data" --path-filter "API,database,models"

# Use with other options
localsearch search "error" --path-filter "src,lib" --search-type semantic --pretty
```

How path filtering works:

- Uses case-insensitive substring matching
- Multiple patterns are separated by commas
- Results match if the path contains ANY of the specified patterns (OR logic)
- Examples:
  - `"src"` matches: `src/main.rs`, `my_src_file.txt`, `project/src/lib.rs`
  - `"src,test"` matches: `src/main.rs`, `tests/unit.rs`, `src_backup.txt`
## Library Usage

```rust
use localsearch::{SqliteLocalSearchEngine, LocalEmbedder, DocumentIndexer, LocalSearch, SearchType, DocumentRequest, LocalSearchDirs};

fn main() -> anyhow::Result<()> {
    // Option 1: Use default system directories
    let dirs = LocalSearchDirs::new();
    let db_path = dirs.default_db_path();
    let embedder = LocalEmbedder::new_with_default_model()?;

    // Option 2: Use custom cache directory
    // let custom_cache = std::path::PathBuf::from("/custom/cache");
    // let embedder = LocalEmbedder::new_with_cache_dir(custom_cache)?;

    // Option 3: Use your own local ONNX model and tokenizer
    // let onnx_path = std::path::PathBuf::from("/path/to/your/model.onnx");
    // let tokenizer_dir = std::path::PathBuf::from("/path/to/tokenizer/files");
    // let embedder = LocalEmbedder::new_with_local_model(onnx_path, tokenizer_dir, Some(512))?;

    let mut engine = SqliteLocalSearchEngine::new(&db_path.to_string_lossy(), Some(embedder))?;

    // Index a document
    engine.insert_document(DocumentRequest {
        path: "some/unique/path".to_string(),
        content: "This is example content".to_string(),
        metadata: None,
    })?;

    // Search (no path filter)
    let results = engine.search("example", SearchType::Hybrid, Some(10), None)?;

    // Search with path filters (multiple patterns supported)
    let filters = vec!["src".to_string(), "test".to_string()];
    let filtered_results = engine.search("example", SearchType::Hybrid, Some(10), Some(&filters))?;

    Ok(())
}
```

### Searching with Path Filters

```rust
use localsearch::{SqliteLocalSearchEngine, LocalEmbedder, LocalSearch, SearchType};

fn search_examples(engine: &SqliteLocalSearchEngine) -> anyhow::Result<()> {
    // Search all documents
    let all_results = engine.search("rust programming", SearchType::Hybrid, Some(10), None)?;

    // Search only in source files
    let src_filter = vec!["src".to_string()];
    let src_results = engine.search("function", SearchType::Semantic, Some(5), Some(&src_filter))?;

    // Search in multiple path patterns
    let multi_filters = vec!["src".to_string(), "test".to_string(), "doc".to_string()];
    let filtered_results = engine.search(
        "example code",
        SearchType::Hybrid,
        Some(10),
        Some(&multi_filters)
    )?;

    // Search with specific file patterns
    let file_filters = vec!["main.rs".to_string(), "lib.rs".to_string()];
    let file_results = engine.search(
        "implementation",
        SearchType::FullText,
        Some(3),
        Some(&file_filters)
    )?;

    Ok(())
}
```
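Under the hood, semantic search ranks documents by the similarity between the query embedding and each stored document embedding; cosine similarity is the standard measure for this. A minimal illustrative sketch (not the engine's actual implementation):

```rust
// Cosine similarity between two embedding vectors: the dot product
// divided by the product of the vector norms. Returns a value in [-1, 1];
// higher means more semantically similar.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

fn main() {
    let query = [1.0, 0.0];
    let doc_a = [1.0, 0.0]; // same direction -> similarity 1.0
    let doc_b = [0.0, 1.0]; // orthogonal -> similarity 0.0
    assert!((cosine_similarity(&query, &doc_a) - 1.0).abs() < 1e-6);
    assert!(cosine_similarity(&query, &doc_b).abs() < 1e-6);
}
```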
### Using Local ONNX Models
You can now use your own local ONNX embedding models instead of the default pre-built models:
```rust
use localsearch::LocalEmbedder;
use std::path::PathBuf;
// Method 1: Using a tokenizer directory
// Your tokenizer directory should contain:
// - tokenizer.json
// - config.json
// - special_tokens_map.json
// - tokenizer_config.json
let onnx_path = PathBuf::from("/path/to/your/model.onnx");
let tokenizer_dir = PathBuf::from("/path/to/tokenizer/directory");
let embedder = LocalEmbedder::new_with_local_model(onnx_path, tokenizer_dir, Some(512))?;
// Method 2: Using individual file paths
let embedder = LocalEmbedder::new_with_local_files(
PathBuf::from("/path/to/model.onnx"),
PathBuf::from("/path/to/tokenizer.json"),
PathBuf::from("/path/to/config.json"),
PathBuf::from("/path/to/special_tokens_map.json"),
PathBuf::from("/path/to/tokenizer_config.json"),
Some(512) // max_length
)?;
```

Required files for local models:
- **ONNX model file:** your embedding model in ONNX format (`.onnx`)
- **Tokenizer files:** four JSON files typically found with transformer models:
  - `tokenizer.json` - main tokenizer configuration
  - `config.json` - model configuration
  - `special_tokens_map.json` - special token mappings
  - `tokenizer_config.json` - tokenizer-specific configuration
These files are commonly found in HuggingFace model repositories or can be exported when converting models to ONNX format.
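Before pointing localsearch at a local model, it can help to sanity-check that the tokenizer directory contains all four files. A small stand-alone helper sketch (not part of the localsearch API):

```rust
use std::path::Path;

// Return the names of any required tokenizer files missing from `dir`.
fn missing_tokenizer_files(dir: &Path) -> Vec<&'static str> {
    const REQUIRED: [&str; 4] = [
        "tokenizer.json",
        "config.json",
        "special_tokens_map.json",
        "tokenizer_config.json",
    ];
    REQUIRED
        .iter()
        .copied()
        .filter(|f| !dir.join(f).is_file())
        .collect()
}

fn main() {
    let dir = Path::new("/path/to/tokenizer/directory");
    let missing = missing_tokenizer_files(dir);
    if missing.is_empty() {
        println!("tokenizer directory looks complete");
    } else {
        eprintln!("missing files: {:?}", missing);
    }
}
```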
## Development

```bash
# Clone the repository
git clone https://github.com/nnanto/localsearch.git
cd localsearch

# Run tests
cargo test

# Run CLI with features
cargo run --features cli -- search "query"
```

## License

MIT License - see LICENSE file for details.