An experimental embedded SQL engine.

SlothDB is a from-scratch C++20 embedded SQL database in active development. It follows the same in-process model as DuckDB and SQLite and queries Parquet, CSV, JSON, Arrow, Avro, SQLite, and Excel files directly with SQL. The project is early-stage; see the Status section below before treating any performance numbers as final.

-- Query files directly. No import step.
SELECT region, SUM(revenue)
FROM 'sales.parquet'
GROUP BY region;


MIT · embedded · 7 file formats in core

Query files and cloud data directly.

Point SQL at CSV, Parquet, JSON, Avro, Excel, Arrow, or SQLite in-process. No import step. Local paths, https://, and public s3:// URLs all take the same SQL. Seven file formats in the core binary, no INSTALL / LOAD to run first.


How we built SlothDB

Same embedded model as DuckDB and SQLite. Link it in, point SQL at files on disk. The defaults are different.

Live

CREATE LIVE VIEW caches the result and re-parses only the new bytes when a CSV grows. For logs, event streams, and append-only exports, running the same query a second time is incremental.
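
The idea behind re-parsing only the new bytes can be sketched in a few lines of Python. This is an illustration of the technique, not SlothDB's implementation: the class name IncrementalCsvSum and its fields are hypothetical, it assumes an append-only UTF-8 file, and it assumes no quoted field contains a newline.

```python
import csv
import io
import os
import tempfile


class IncrementalCsvSum:
    """Running SUM over one CSV column that reads only bytes appended since the last refresh."""

    def __init__(self, path, column):
        self.path = path
        self.column = column
        self.offset = 0      # bytes already consumed from the file
        self.header = None   # column names, parsed once
        self.total = 0.0     # cached aggregate

    def refresh(self):
        with open(self.path, "rb") as f:
            f.seek(self.offset)          # skip everything parsed earlier
            chunk = f.read()
        # Consume complete lines only; a half-written last line waits for the next refresh.
        end = chunk.rfind(b"\n") + 1
        if end == 0:
            return self.total
        self.offset += end
        text = chunk[:end].decode("utf-8")
        for row in csv.reader(io.StringIO(text, newline="")):
            if not row:
                continue                 # blank line
            if self.header is None:
                self.header = row
                continue
            self.total += float(row[self.header.index(self.column)])
        return self.total


path = os.path.join(tempfile.mkdtemp(), "events.csv")
with open(path, "w", newline="") as f:
    f.write("region,revenue\nwest,10\neast,5\n")

view = IncrementalCsvSum(path, "revenue")
first = view.refresh()       # parses the header and two rows

with open(path, "a", newline="") as f:
    f.write("west,2.5\n")

second = view.refresh()      # parses only the appended row
print(first, second)         # 15.0 17.5
```

The cached total plus a byte offset is the whole state here; a real engine would also have to detect truncation or rewrites of the file and fall back to a full scan.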

Simple

Point SQL at a file: no CREATE TABLE, no COPY FROM, no extension to load first. Parquet, CSV, JSON, Avro, Arrow, SQLite, and Excel all work out of the box.

Fast

Morsel-driven parallelism, typed int64 JOIN hashes, SIMD CSV scanning, COUNT(*) over JOIN fused into the aggregate. Whether that translates into a meaningful speedup depends on the workload; see the Status section.
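
To make "COUNT(*) over JOIN fused into the aggregate" concrete, here is a minimal Python sketch of the general idea, assuming an inner equi-join on a single non-NULL key. The function names are hypothetical and this is not the engine's code; it only contrasts materializing joined rows with counting matches directly.

```python
from collections import Counter


def count_join_materialized(left_keys, right_keys):
    """Baseline: build every joined pair, then count the pairs."""
    build = {}
    for k in right_keys:
        build.setdefault(k, []).append(k)
    joined = [(l, r) for l in left_keys for r in build.get(l, [])]
    return len(joined)


def count_join_fused(left_keys, right_keys):
    """Fused: add up match counts per probe key without producing joined rows."""
    right_counts = Counter(right_keys)
    return sum(right_counts.get(k, 0) for k in left_keys)


left = [1, 1, 2, 3, 5]
right = [1, 2, 2, 4, 5, 5, 5]
print(count_join_materialized(left, right), count_join_fused(left, right))  # 7 7
```

Both functions return the same count, but the fused version never allocates the intermediate join result, which is where the saving would come from on large inputs.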

Edge-ready

A self-contained native binary (1-2 MB on Windows/Linux/macOS, x86_64 and arm64). -DSLOTHDB_EDGE=ON strips the engine to CSV / JSON / Parquet so the WASM fits under Cloudflare Workers' 1 MB script cap.

Extensible

Extensions link against a stable C ABI with numeric error codes (ErrorCode::TABLE_NOT_FOUND = 2000). Bindings built against 0.1.x keep compiling on later releases.
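
As a rough illustration of why numeric codes help bindings stay stable, the sketch below shows how a Python wrapper might translate a status code into an exception. Only TABLE_NOT_FOUND = 2000 comes from the text above; SlothError, raise_for_code, and the convention that 0 means success are assumptions made for the example.

```python
from enum import IntEnum


class ErrorCode(IntEnum):
    # Only this value appears in the text; a real binding would mirror the full list.
    TABLE_NOT_FOUND = 2000


class SlothError(Exception):  # hypothetical exception type
    def __init__(self, code, message):
        super().__init__(f"[{code}] {message}")
        self.code = code


def raise_for_code(code, message):
    """Turn a numeric status into an exception; 0 is ASSUMED to mean success."""
    if code == 0:
        return
    try:
        name = ErrorCode(code).name
    except ValueError:
        name = None  # code introduced by a newer engine release
    raise SlothError(code, message if name is None else f"{name}: {message}")
```

Because callers branch on the integer, an unknown code from a newer engine still surfaces as a SlothError with its number intact instead of breaking the binding.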

Free

Released under the MIT license. Embed it in commercial products, ship it in a desktop app, redistribute the binary. Source on GitHub, developed in the open.

Built for your stack

SlothDB ships with native file format readers, native clients, and runs on every major platform.

[Diagram: SlothDB and its ecosystem, grouped as Formats, Storage, Platforms, Clients, Data science, and Extensions.]

Ask in any language. Get SQL.

Type .ask at the slothdb> prompt. A rules parser handles catalog questions and common English shapes in under 10 ms with no model. Anything else falls through to a local Qwen2.5-Coder model (0.5B for simple questions, 1.5B for analytic ones), lazy-downloaded on first use in builds configured with -DSLOTHDB_ASK_MODEL=ON. The model speaks 29 natural languages, including English, Chinese, Spanish, French, German, Japanese, Korean, Russian, Arabic, Portuguese, Italian, and Hindi. Every generated statement is shown before it runs. Nothing leaves the machine. Set SLOTHDB_ASK_CONFIRM=1 to add a [Y/n] prompt before each run.

.ask pipeline: rules-first, router, two local Qwens, [Y/n] gate

tier	what	cost	covers
1	Rules parser (default)	<10 ms, no model	catalog, COUNT/SUM/AVG/GROUP BY/TOP-N, file-source intents
2	Qwen2.5-Coder 0.5B Q4_K_M	~200 ms, ~310 MB	open-ended SELECT / GROUP BY / filter
3	Qwen2.5-Coder 1.5B Q4_K_M	~500 ms, ~986 MB	window functions, ranking within groups, LAG/LEAD, joins

What this does not buy you: GPT-4-class SQL. Qwen at Q4 still hallucinates column names and misreads wide schemas; the SQL is always shown before it runs so you can catch wrong answers (SLOTHDB_ASK_CONFIRM=1 adds a keypress gate if you want a hard stop). Requests for cumulative, running, or moving aggregates are refused cleanly (an engine gap, not a model gap). For top-tier accuracy on complex joins, a cloud model behind your own API key still beats SlothDB. This is the option for when that is not allowed.

Full pipeline spec, router signals, refusal policy: docs/ASK.md.
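
The rules-first, then-model flow described above can be sketched as follows. Everything here is illustrative: the two regex rules, the ANALYTIC_HINTS word list, and the function names are hypothetical stand-ins, and the real router signals are the ones specified in docs/ASK.md. The two models are passed in as plain callables so the sketch runs without any model installed.

```python
import re

# Tier 1: (pattern, SQL template). Two illustrative rules, not the real rule set.
RULES = [
    (re.compile(r"^how many rows (?:are )?in (\w+)\??$", re.I),
     "SELECT COUNT(*) FROM {0};"),
    (re.compile(r"^total (\w+) by (\w+) in (\w+)\??$", re.I),
     "SELECT {1}, SUM({0}) FROM {2} GROUP BY {1};"),
]

# Words that, in this sketch only, send a question to the larger model.
ANALYTIC_HINTS = ("rank", "previous", "lag", "lead", "join", "window")


def route(question, small_model, large_model):
    """Return (tier, sql). Each model argument is a callable: question -> SQL string."""
    q = question.strip()
    for pattern, template in RULES:
        m = pattern.match(q)
        if m:
            return 1, template.format(*m.groups())   # no model involved
    if any(word in q.lower() for word in ANALYTIC_HINTS):
        return 3, large_model(q)
    return 2, small_model(q)


def confirm_and_run(sql, run, confirm=False, ask=input):
    """Always show the SQL; with confirm=True, answering 'n' skips execution."""
    print(sql)
    if confirm and ask("Run? [Y/n] ").strip().lower() == "n":
        return None
    return run(sql)
```

For example, route("total revenue by region in sales", small, large) is answered by tier 1 and never touches either model, while a question mentioning ranking falls through to the larger one.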

Try it in 60 seconds

No files to find. The demo generates a 100 000-row CSV, runs three queries, and prints the timings; if DuckDB is on PATH it prints a side-by-side. Numbers depend on your hardware and DuckDB version.
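
If you want to reproduce the shape of that demo by hand, the sketch below generates a synthetic 100,000-row CSV and times one aggregation over it. It is not the bundled demo: the column names, value ranges, and helper names are made up, and the pure-Python GROUP BY only stands in for the query an engine would run.

```python
import csv
import os
import random
import tempfile
import time
from collections import defaultdict


def write_demo_csv(path, rows=100_000, seed=0):
    """Write a synthetic sales file; columns and value ranges are invented for this sketch."""
    rng = random.Random(seed)
    regions = ["north", "south", "east", "west"]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["region", "revenue"])
        for _ in range(rows):
            writer.writerow([rng.choice(regions), rng.randint(1, 1000)])


def timed(fn, *args):
    """Return (result, elapsed seconds) measured with a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def revenue_by_region(path):
    """Stand-in for: SELECT region, SUM(revenue) FROM <file> GROUP BY region."""
    totals = defaultdict(int)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row["region"]] += int(row["revenue"])
    return dict(totals)


path = os.path.join(tempfile.mkdtemp(), "sales.csv")
write_demo_csv(path)
totals, seconds = timed(revenue_by_region, path)
print(f"{len(totals)} groups in {seconds:.3f} s")
```

Timings from a harness like this vary with hardware and cache state, so run each query several times before comparing engines.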

Status

SlothDB is early-stage, experimental, and in active development. Treat any performance numbers on this site and in bench/ as anecdotal: they come from one workstation and one DuckDB version, with no standardised benchmark hardware behind them.

Install SlothDB

Single binary. No runtime dependencies. Pick a client.

Python

Full SlothDB engine as a pip-installable package. Connection API, pandas integration, context manager support.

pip install slothdb
python -c "import slothdb; slothdb.demo()"

Node.js / Browser

WebAssembly build that works in Node ≥18 and every modern browser. 1.3 MB wasm, zero native deps.

npm install @slothdb/wasm

import { SlothDB } from '@slothdb/wasm';
const db = await SlothDB.create();
db.query("SELECT 1 AS n");

CLI binary

Prebuilt SlothDB shell for Linux (x86_64 / arm64), macOS, and Windows. No runtime dependencies.

# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/SouravRoy-ETL/slothdb/main/install.sh | bash

# Windows: grab slothdb.exe from
# github.com/SouravRoy-ETL/slothdb/releases

Build from source

Clone, CMake, and build. Produces a shared library, static library, and CLI binary.

git clone https://github.com/SouravRoy-ETL/slothdb
cd slothdb
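# CMAKE_BUILD_TYPE covers single-config generators (Make, Ninja); --config below covers multi-config ones
cmake -B build -DCMAKE_BUILD_TYPE=Release -DSLOTHDB_BUILD_SHELL=ON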
cmake --build build --config Release