🎯

Bitfilters

Probabilistic set-membership filters as DuckDB columns. Quotient, XOR (8/16-bit), Binary Fuse (8/16-bit), and DuckDB-style Bloom filters β€” answer 'is X in this set?' in O(1) using a fraction of the memory of the source set, with tunable false-positive rates and zero false negatives.

Install

-- Install the extension
INSTALL bitfilters FROM community;

-- Load it into your session
LOAD bitfilters;

-- Build an xor8 filter from hashed keys (1% false-positive rate, ~9 bits per key)
WITH f AS (
  SELECT xor8_filter(hash(email)) AS filter FROM users
)
SELECT email,
       xor8_filter_contains(f.filter, hash(email)) AS might_be_known
FROM new_signups, f
LIMIT 10;

-- Per-partition Quotient filter β€” supports deletion and resize
SELECT bucket,
       quotient_filter(16, 4, hash(id)) AS filter
FROM ids
GROUP BY bucket;

Technical Overview

Why Use Bitfilters?

Build a probabilistic set-membership filter from a column, ship it as a small BLOB, and probe it from anywhere DuckDB runs. Answer "is this maybe in the set?" in O(1) using a fraction of the memory of the source set β€” with no false negatives and a tunable false-positive rate. The classic use is skipping work when the answer is definitely no: pre-filter joins, skip partitions, dodge cache misses.

🎯 What this extension is for

Bitfilters materializes four families of approximate-membership-query (AMQ) data structures as DuckDB aggregate functions. Build the filter once, store the BLOB as an ordinary column, probe it later β€” locally, in another DuckDB instance, or shipped over the network.

  • β€’ Skip a join when the answer is definitely no: Build a filter from the small "interesting" side of a join, then test the large side against it. False positives let an unwanted row through; false negatives never happen β€” so the cheap probe + expensive verify pattern is exact for correctness.
  • β€’ Per-partition data skipping: Materialize one filter per logical partition, store alongside metadata, and probe at query time to skip whole partitions that can't possibly contain the key. For high-cardinality lookups against fact tables this routinely skips 99%+ of partitions.
  • β€’ Compact set artifacts: A Binary Fuse 8 filter is under 9 bits per key β€” three orders of magnitude smaller than the source set for typical UUID workloads. Persist it in a Parquet column, ship it to another process, embed it in a manifest.
  • β€’ DuckDB-native join filters: One family emits filters in the same internal format DuckDB uses for its own runtime join filters β€” persist a filter, then re-use it in custom data-skipping logic the optimizer recognizes.

πŸ§ͺ Filter families

Four families, three design points. Pick on mutation, memory, and false-positive rate in that order β€” not the other way around.

  • β€’ Quotient β€” dynamic sets: The only family here that supports deletion and resize. Pick it when the underlying set evolves at runtime. Tunable slot count and FPR. See Quotient filter.
  • β€’ XOR β€” static, cross-system compatible: ~9 bits per key at ~0.39% FPR (8-bit) or ~18 bits per key at ~0.0015% FPR (16-bit) β€” about 20% smaller than a Bloom at the same FPR. Built once, queried many times. See the Xor filters paper (Graf & Lemire, 2019).
  • β€’ Binary Fuse β€” modern best-in-class: Typically lands under 9 bits per key with very fast construction and lookups β€” currently the most space-efficient static filter family. Default choice for read-heavy, append-only sets.
  • β€’ DuckDB Bloom β€” native compatibility: Speaks DuckDB's internal Bloom filter format and hash layout, so a persisted filter slots into the engine's own data-skipping path.

βš™οΈ How it works

Every filter constructor is a DuckDB aggregate function that takes UBIGINT hashes as input and returns a single BLOB. The result is dense, self-contained, and storable like any other column.

  • β€’ Hashes go in, BLOBs come out: Hash your input first β€” DuckDB's built-in hash() is fine for most cases. For specific algorithms (xxh3_64, MurmurHash3, rapidhash), pair with the hashfuncs extension. Filters distribute keys based on hash bits β€” quality of the hash matters.
  • β€’ Filters as columns: Every filter is just a BLOB. Store it in a regular column, persist to Parquet, send over the wire, hold in a CTE, partition by anything β€” there is no special filter type, only the matching *_contains probe function.
  • β€’ No false negatives, ever: The filter never lies about absence: a false from a probe means the key is definitively not in the source set. False positives are bounded β€” measure them by probing known non-members and confirm the rate matches the family.
  • β€’ Set-membership only β€” not cardinality: Bitfilters answers "is X in the set?". For cardinality estimation ("how many distinct items?") and set intersection sizes, use the sibling datasketches extension β€” it has Theta sketches and HyperLogLog for that workload. The two extensions cover complementary problems.

πŸ›‘οΈ Honest limitations

What probabilistic filters cannot do β€” read this before designing a system around one.

  • β€’ False positives are inherent: Every family here trades exactness for memory. A true from *_contains means probably present, not definitely present. If your downstream cannot tolerate even one wrong yes, pair the probe with an exact verify β€” the filter just narrows the candidate set.
  • β€’ Filters cannot enumerate: There's no way to list the members from a filter β€” it's set-membership only. If you need the elements themselves, store them; if you need cardinality, use datasketches.
  • β€’ Static families don't support updates: XOR, Binary Fuse, and DuckDB-Bloom filters are built once and immutable. Adding a single key requires a full rebuild. Use quotient_filter when the set genuinely churns, or rebuild on a cadence that matches your write throughput.
  • β€’ Tiny sets don't earn the filter: For a few thousand elements an in-memory hash set or a sorted array is faster and exact. Bitfilters earns its keep when the source set is large enough that storing the full set is what hurts β€” millions of keys, per-partition replication, network ship costs.

🎯 Common Use Cases

Pre-filter an expensive join or lookup

Build a filter from the small side, probe from the large side, verify only candidates that pass. The probe is O(1) and fits in cache; the verify only runs at the false-positive rate. Net: the join sees a fraction of the rows it would otherwise.

Per-partition data skipping

Materialize one Binary Fuse filter per partition at write time, store alongside partition metadata. At query time the probe tells the planner which partitions can possibly contain the key β€” skip the rest.

Ship a set as a tiny artifact

A 10M-key Binary Fuse 8 filter is ~11 MB. Persist it to S3, attach it to a manifest, distribute it to read replicas, embed it in a CDN β€” any place where shipping the full set would be too big.

Dedupe pre-check / cache hit prediction

Probe a filter built from "already-seen" keys before doing the expensive dedupe work. Misses are free; hits go to the exact path. Same pattern for predicting cache hits before issuing the round-trip.

Deep Dive

Technical Details

What is a probabilistic filter?

A probabilistic set-membership filter answers β€œIs X in this set?” with two carefully chosen guarantees:

  • No false negatives. If the filter says NO, the element is definitely not in the set.
  • Bounded false positives. If the filter says YES, the element is probably in the set β€” with a measurable probability of being wrong.

That’s a powerful tradeoff. You give up exactness in one direction in exchange for:

  • Memory β€” typically 1–2 bytes per element instead of storing the whole element.
  • Speed β€” O(1) probe, often a single cache line read.

Use cases include join pre-filtering, data skipping, dedupe pre-checks, cache hit/miss prediction, and routing requests to the right shard.

Filter family comparison

FamilyMemory (β‰ˆ, per element)False positive rateMutationBest for
Quotienttunable (q, r)~2^(βˆ’r)Add, delete, resizeDynamic sets
XOR-8~9 bits~0.39%StaticGeneral-purpose
XOR-16~18 bits~0.0015%StaticVery low FPR needed
Binary Fuse-8<9 bits~0.39%StaticSmallest static filter
Binary Fuse-16~18 bits~0.0015%StaticSmallest very-low-FPR filter
DuckDB BloomtunabletunableStaticDuckDB-native compat

For most read-heavy workloads, Binary Fuse 8 is the right default. Switch to Quotient when the underlying set changes; switch to a 16-bit variant when even 0.4% false positives are too many.

Hashing the input

Every filter takes UBIGINT hashes as input β€” the algorithm-quality of the hash matters because filters distribute keys based on the hash bits. Three options, in increasing fanciness:

  1. hash(value) β€” DuckDB’s built-in. Good distribution, very fast, no extension needed.
  2. xxh3_64(value) from hashfuncs β€” slightly faster, well-distributed, cross-language compatible.
  3. crypto_hash('sha2-256', value) from crypto cast to UBIGINT β€” only when adversarial inputs matter (cryptographic resistance to filter-poisoning).

For 99% of cases use option 1 or 2. Option 3 is correct but ~10Γ— slower.

Filters as columns

Every filter constructor is an aggregate function and produces a single BLOB. That makes filters first-class values you can:

  • Store in a regular column (CREATE TABLE filters (bucket INT, filter BLOB)).
  • Persist in Parquet for cross-process reuse.
  • Pass to other queries via CTE or table valued.
  • Combine via UNION (build per-partition filters separately, then probe each).

There’s no special filter type β€” just BLOB plus the matching *_contains function.

When not to use a filter

  • You actually need the elements, not membership. Filters can’t enumerate; they’re set-only.
  • The set is tiny. A 10K-element set in a HashMap is faster than a filter.
  • You can’t tolerate any false positives, including in a probe-then-verify pattern. (Though probe-then-verify makes filters exact for correctness purposes β€” you just pay the verify cost on FPR-rate of inputs.)
  • The set churns continuously. Even Quotient filters have construction cost; if you rebuild on every query, the filter isn’t earning its keep.

Reference

Extension Contents

Quick reference to all available functions and settings organized by category.

Name Description
Binary Fuse Filters

Binary Fuse is the modern best-in-class space-efficient family. <9 bits per key for the 8-bit variant, very fast construction and lookups. Static structure β€” pick this for read-heavy workloads where storage matters most.

binary_fuse16_filter() Aggregate that builds a 16-bit Binary Fuse filter β€” same family as binary_fuse8 but with much lower false-positive rate at the cost of ~2Γ— the memory
binary_fuse16_filter_contains() Test whether a hash value is a member of a 16-bit Binary Fuse filter
binary_fuse8_filter() Aggregate that builds an 8-bit Binary Fuse filter from hashed keys
binary_fuse8_filter_contains() Test whether a hash value is a member of an 8-bit Binary Fuse filter
DuckDB Bloom Filters

Bloom filters in DuckDB's native internal format. Use these when you need to materialize join filters or zone-map-style data-skipping structures that the DuckDB optimizer can consume directly.

bitfilters_duckdb_bloom_filter_create() Build a Bloom filter compatible with DuckDB's internal join filter format
bitfilters_duckdb_bloom_filter_probe() Test whether a hash value might be a member of a DuckDB-compatible Bloom filter built by bitfilters_duckdb_bloom_filter_create
bitfilters_duckdb_hash() Hash function compatible with DuckDB's internal Bloom filter probe layout
Quotient Filter

Quotient filters support deletion and resize. Use when the underlying set evolves at runtime β€” adds, removes, occasional rebuilds. Configurable via q (slot count = 2^q) and r (remainder bits per slot).

quotient_filter() Aggregate that builds a Quotient filter from hashed keys
quotient_filter_contains() Test whether a hash value is a member of a Quotient filter
XOR Filters

XOR filters are static β€” built once, queried many times. ~9 bits per key for the 8-bit variant (0.39% FPR), ~18 bits per key for the 16-bit (0.0015% FPR). ~20% smaller than Bloom for the same false-positive rate.

xor16_filter() Aggregate that builds a 16-bit XOR filter from hashed keys
xor16_filter_contains() Test whether a hash value is a member of a 16-bit XOR filter
xor8_filter() Aggregate that builds an 8-bit XOR filter from hashed keys
xor8_filter_contains() Test whether a hash value is a member of an 8-bit XOR filter

API Reference

Function Documentation

Detailed documentation for each function including signatures, parameters, and examples.

binary_fuse16_filter

Aggregate Function Binary Fuse Filters
Signature
binary_fuse16_filter(col0: UBIGINT) β†’ BLOB
Parameters (Positional)
Parameter Type Mode Description
col0 UBIGINT Positional
Returns
Description

Aggregate that builds a 16-bit Binary Fuse filter β€” same family as binary_fuse8 but with much lower false-positive rate at the cost of ~2Γ— the memory.

Examples
1

Build a Binary Fuse 16 filter

SELECT binary_fuse16_filter(hash(id)) AS filter FROM ids;

binary_fuse16_filter_contains

Scalar Function Binary Fuse Filters
Signature
binary_fuse16_filter_contains(col0: BLOB, col1: UBIGINT) β†’ BOOLEAN
Parameters (Positional)
Parameter Type Mode Description
col0 BLOB Positional
col1 UBIGINT Positional
Returns
Description

Test whether a hash value is a member of a 16-bit Binary Fuse filter.

Examples
1

Membership test

SELECT binary_fuse16_filter_contains(filter, hash(:candidate)) FROM filter_table;

binary_fuse8_filter

Aggregate Function Binary Fuse Filters
Signature
binary_fuse8_filter(col0: UBIGINT) β†’ BLOB
Parameters (Positional)
Parameter Type Mode Description
col0 UBIGINT Positional
Returns
Description

Aggregate that builds an 8-bit Binary Fuse filter from hashed keys. The most space-efficient filter family β€” typically <9 bits per key β€” with very fast construction and lookups. Static structure.

Examples
1

Build a Binary Fuse 8 filter

SELECT binary_fuse8_filter(hash(id)) AS filter FROM ids;

binary_fuse8_filter_contains

Scalar Function Binary Fuse Filters
Signature
binary_fuse8_filter_contains(col0: BLOB, col1: UBIGINT) β†’ BOOLEAN
Parameters (Positional)
Parameter Type Mode Description
col0 BLOB Positional
col1 UBIGINT Positional
Returns
Description

Test whether a hash value is a member of an 8-bit Binary Fuse filter.

Examples
1

Membership test

SELECT binary_fuse8_filter_contains(filter, hash(:candidate)) FROM filter_table;

bitfilters_duckdb_bloom_filter_create

Aggregate Function DuckDB Bloom Filters
Signature
bitfilters_duckdb_bloom_filter_create(col0: VARCHAR, col1: INTEGER, col2: UBIGINT) β†’ BLOB
Parameters (Positional)
Parameter Type Mode Description
col0 VARCHAR Positional
col1 INTEGER Positional
col2 UBIGINT Positional
Returns
Description

Build a Bloom filter compatible with DuckDB's internal join filter format. Produced filters can be used in custom join optimization paths and in zone-map-style data skipping.

Examples
1
SELECT bitfilters_duckdb_bloom_filter_create('v1.5.1', 16384, bitfilters_duckdb_hash('v1.5.1', key)) FROM tbl;

bitfilters_duckdb_bloom_filter_probe

Scalar Function DuckDB Bloom Filters
Signature
bitfilters_duckdb_bloom_filter_probe(col0: VARCHAR, col1: BLOB, varargs: ANY) β†’ BOOLEAN
Parameters (Positional)
Parameter Type Mode Description
col0 VARCHAR Positional
col1 BLOB Positional
varargs ANY Positional Varargs
Returns
Description

Test whether a hash value might be a member of a DuckDB-compatible Bloom filter built by bitfilters_duckdb_bloom_filter_create.

Examples
1
SELECT bitfilters_duckdb_bloom_filter_probe('v1.5.1', filter_blob, key_col) FROM tbl;
2
SELECT bitfilters_duckdb_bloom_filter_probe('v1.5.1', filter_blob, col1, col2) FROM tbl;

bitfilters_duckdb_hash

Scalar Function DuckDB Bloom Filters
Signature
bitfilters_duckdb_hash(col0: VARCHAR, varargs: ANY) β†’ UBIGINT
Parameters (Positional)
Parameter Type Mode Description
col0 VARCHAR Positional
varargs ANY Positional Varargs
Returns
Description

Hash function compatible with DuckDB's internal Bloom filter probe layout. Use this rather than a generic hash when feeding values into the DuckDB-style Bloom filter functions.

Examples
1
SELECT bitfilters_duckdb_hash('v1.5.1', 42);

Output

bitfilters_duckdb_hash('v1.5.1', 42)
7199933130570745587
2
SELECT bitfilters_duckdb_hash('v1.5.1', col1, col2) FROM tbl;

quotient_filter

Aggregate Function Quotient Filter
Signature
quotient_filter(col0: INTEGER, col1: INTEGER, col2: UBIGINT | BLOB) β†’ BLOB
Parameters (Positional)
Parameter Type Mode Description
col0 INTEGER Positional
col1 INTEGER Positional
col2 UBIGINT | BLOB
2 concrete types
BLOBUBIGINT
Positional
Returns
Description

Aggregate that builds a Quotient filter from hashed keys. q controls slot count (2^q slots), r controls remainder bits per slot. Quotient filters support deletion and resize β€” use them when the underlying set evolves over time.

Examples
1

Build a Quotient filter per partition

CREATE TABLE filters AS
SELECT id % 2 AS remainder,
       quotient_filter(16, 4, hash(id)) AS filter
FROM series_data
GROUP BY id % 2;

quotient_filter_contains

Scalar Function Quotient Filter
Signature
quotient_filter_contains(col0: BLOB, col1: UBIGINT) β†’ BOOLEAN
Parameters (Positional)
Parameter Type Mode Description
col0 BLOB Positional
col1 UBIGINT Positional
Returns
Description

Test whether a hash value is a member of a Quotient filter. Returns BOOLEAN β€” false means definitely-not-present; true means probably-present (within the filter's configured false-positive rate).

Examples
1

Membership test against a stored filter

SELECT quotient_filter_contains(filter, hash(12345)) AS might_exist
FROM filters WHERE remainder = 1;

xor16_filter

Aggregate Function XOR Filters
Signature
xor16_filter(col0: UBIGINT) β†’ BLOB
Parameters (Positional)
Parameter Type Mode Description
col0 UBIGINT Positional
Returns
Description

Aggregate that builds a 16-bit XOR filter from hashed keys. ~18 bits per key, ~0.0015% false positive rate. Use when you need a much lower FPR than xor8 and can afford 2Γ— the memory.

Examples
1

Build an xor16 filter for low FPR

SELECT xor16_filter(hash(id)) AS filter FROM ids;

xor16_filter_contains

Scalar Function XOR Filters
Signature
xor16_filter_contains(col0: BLOB, col1: UBIGINT) β†’ BOOLEAN
Parameters (Positional)
Parameter Type Mode Description
col0 BLOB Positional
col1 UBIGINT Positional
Returns
Description

Test whether a hash value is a member of a 16-bit XOR filter.

Examples
1

Membership test

SELECT xor16_filter_contains(filter, hash(:candidate)) FROM filter_table;

xor8_filter

Aggregate Function XOR Filters
Signature
xor8_filter(col0: UBIGINT) β†’ BLOB
Parameters (Positional)
Parameter Type Mode Description
col0 UBIGINT Positional
Returns
Description

Aggregate that builds an 8-bit XOR filter from hashed keys. ~9 bits per key, ~0.39% false positive rate. Static β€” built once, queried many times. ~20% smaller than Bloom for the same FPR.

Examples
1

Build an xor8 filter from a column of hashed IDs

SELECT xor8_filter(hash(id)) AS filter FROM ids;

xor8_filter_contains

Scalar Function XOR Filters
Signature
xor8_filter_contains(col0: BLOB, col1: UBIGINT) β†’ BOOLEAN
Parameters (Positional)
Parameter Type Mode Description
col0 BLOB Positional
col1 UBIGINT Positional
Returns
Description

Test whether a hash value is a member of an 8-bit XOR filter.

Examples
1

Membership test

SELECT xor8_filter_contains(filter, hash(:candidate)) FROM filter_table;

Practical Examples

Cookbook

Real-world recipes and patterns for common use cases.

Pick the Right Filter

NeedPick
Static set, smallest possible memorybinary_fuse8_filter
Static set, very low FPRbinary_fuse16_filter or xor16_filter
Dynamic set with deletion / resizequotient_filter
Maximum cross-system compatibilityxor8_filter
Plug into DuckDB’s join optimizerbitfilters_duckdb_bloom_filter_*

All filter constructors take UBIGINT hash values as input. Use DuckDB’s built-in hash() for general purpose, or pair with hashfuncs when you want a specific algorithm (xxh3_64, rapidhash, MurmurHash3).

Build a Filter

Filter constructors are aggregate functions β€” pass the hashes of every member of the set and the result is a single BLOB you can store anywhere DuckDB stores BLOBs:

-- Materialize a filter per partition
CREATE TABLE allowed_filters AS
SELECT bucket,
       binary_fuse8_filter(hash(id)) AS filter
FROM allowed_ids
GROUP BY bucket;

The filter is dense and self-contained β€” no separate hash function metadata needed at probe time, just call the matching *_contains function with the same hash.

Pre-Filter an Expensive Join

The classic use case β€” avoid an expensive lookup or cross-partition join when you can probabilistically rule rows out:

-- 1. Build a filter from "interesting" keys, once
CREATE TABLE interesting_filter AS
SELECT xor8_filter(hash(key)) AS filter FROM interesting_keys;

-- 2. Probe the filter to skip the expensive operation entirely for non-members
SELECT e.*
FROM events e, interesting_filter f
WHERE xor8_filter_contains(f.filter, hash(e.key))   -- fast: maybe-member
  AND e.key IN (SELECT key FROM interesting_keys);   -- slow: confirm if maybe

Because filters never produce false negatives, the probe-then-verify pattern is exact β€” you only pay the expensive IN for rows that pass the cheap filter.

Per-Partition Data Skipping

Materialize one filter per logical partition; use it to skip whole partitions at query time:

-- Build filters at write time
CREATE TABLE partition_filters AS
SELECT day,
       binary_fuse8_filter(hash(user_id)) AS users_in_partition
FROM events
GROUP BY day;

-- At query time, only scan partitions that might contain the user
SELECT day, COUNT(*) AS hits
FROM events
WHERE day IN (
  SELECT day FROM partition_filters
  WHERE binary_fuse8_filter_contains(users_in_partition, hash(:target_user))
)
  AND user_id = :target_user
GROUP BY day;

For large fact tables this can skip 99%+ of partitions when looking for a specific user/key.

Verify the False-Positive Rate

If you want to confirm the FPR on your data, count probes that the filter says yes to but the source set doesn’t actually contain:

-- Real members
SELECT COUNT(*) AS true_positives
FROM source_set s, the_filter f
WHERE xor8_filter_contains(f.filter, hash(s.id));

-- Non-members that the filter incorrectly accepts
SELECT COUNT(*) AS false_positives
FROM unrelated_set u, the_filter f
WHERE xor8_filter_contains(f.filter, hash(u.id));

For xor8, expect ~0.39% false positives across enough probes; binary_fuse8 lower; binary_fuse16 / xor16 nearly zero.

Quotient Filter β€” Dynamic Sets

Quotient filters are the only family here that supports deletion and resize. The q and r parameters tune the size/FPR tradeoff:

  • q β€” number of slot index bits. Filter holds up to ~2^q keys.
  • r β€” remainder bits per slot. Higher r β†’ lower FPR, more memory.

Approximate FPR β‰ˆ 2^(βˆ’r). At r = 4, expect ~6% FPR; at r = 8, ~0.4%.

SELECT bucket,
       quotient_filter(20, 8, hash(id)) AS filter   -- ~1M slots, ~0.4% FPR
FROM ids
GROUP BY bucket;

DuckDB-Compatible Bloom Filters

The bitfilters_duckdb_* family produces filters in DuckDB’s internal Bloom filter format β€” the same one the optimizer materializes during certain joins. Use these when you want to:

  • Persist a filter and re-use it later in a join hint.
  • Build per-zone Bloom filters for custom data-skipping logic.

All three functions take a constant version string (e.g. 'v1.5.1') as their first argument β€” bind-folded at parse time β€” so the produced hash and filter layout match the DuckDB build that will probe them later.

-- Build one DuckDB-shaped Bloom filter per bucket. The aggregate's middle
-- argument (16384 here) is the sector count β€” must be a power of two.
CREATE TABLE zone_filters AS
SELECT bucket,
       bitfilters_duckdb_bloom_filter_create(
         'v1.5.1',
         16384,
         bitfilters_duckdb_hash('v1.5.1', key)
       ) AS bf
FROM keys
GROUP BY bucket;

-- Probe the matching filter at scan time
SELECT *
FROM events e, zone_filters z
WHERE z.bucket = e.bucket
  AND bitfilters_duckdb_bloom_filter_probe('v1.5.1', z.bf, e.key);

bitfilters_duckdb_hash and bitfilters_duckdb_bloom_filter_create produce hashes / sector layouts that match the DuckDB build named by the version string β€” pin it to your runtime’s exact version.

Platform Support

Compatibility

Extension availability may vary by platform and DuckDB version. Check below to ensure this extension supports your environment before installation.

Quick Facts

Software License MIT
Pricing Free
Written In C++
Source Available Yes
View on GitHub
Usage
515,050+
loads in last 30 days

Platforms

Linux
Linux (musl)
macOS
Windows
WASM
Supported platform architectures
Linux: aarch64, x86_64
Linux (musl): x86_64
macOS: Apple Silicon, Intel
Windows: x86_64
WASM: eh, threads, mvp
Compiled binary sizes
Platform Architecture Size
Linux aarch64 2.72 MB
Linux x86_64 3.08 MB
Linux (musl) x86_64 2.92 MB
macOS Apple Silicon 1.88 MB
macOS Intel 2.19 MB
Windows x86_64 7.55 MB
WASM eh 35.6 KB
WASM mvp 32.0 KB
WASM threads 32.3 KB

Gzipped download size from the DuckDB community-extensions registry.

DuckDB Versions

Release calendar
Supported
v1.4.4 v1.5.2

Skip Expensive Operations Probabilistically

Install Bitfilters to materialize compact, fast set-membership filters as DuckDB blobs. Pre-filter joins, dodge cache misses, skip data partitions β€” all with measurable false-positive bounds.