DuckDB Extension Development

Custom DuckDB extensions built for performance, portability, and production use.

DuckDB extensions let you add new capabilities to DuckDB without modifying its core: custom functions, new table sources, file formats, storage layers, or external integrations.

At Query.Farm, we specialize in writing high-performance, production-ready DuckDB extensions for real-world data systems.


🔧 What is a DuckDB Extension?

Extensions are modular plugins that expand what DuckDB can do. Common extension types include:

  • Custom SQL functions: scalar, aggregate, table-producing
  • New data sources: REST APIs, file formats, remote systems
  • New table functions: like read_json(), read_parquet(), or your own
  • Integration layers: bridge DuckDB with Arrow, Flight, Kafka, S3, or internal services
  • External libraries: wrap or expose domain-specific C/C++ libraries to SQL

DuckDB loads extensions dynamically at runtime, so your logic stays decoupled from the DuckDB binary. That makes extensions ideal for portability, deployment, and local-first environments.
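As a rough illustration of runtime loading, the sketch below builds the two SQL statements DuckDB uses to fetch and load an extension. It assumes the extension is published in the DuckDB community repository; the helper name extension_load_statements is hypothetical and exists only for this example.

```python
def extension_load_statements(name: str, repository: str = "community") -> list[str]:
    """Return the SQL statements that install and then load a DuckDB extension.

    Hypothetical helper for illustration. The repository default is an
    assumption; core extensions are installed without a FROM clause.
    """
    # Extension and repository names are interpolated into SQL, so accept
    # only plain identifiers.
    if not name.isidentifier() or not repository.isidentifier():
        raise ValueError("extension and repository names must be plain identifiers")
    return [f"INSTALL {name} FROM {repository};", f"LOAD {name};"]


if __name__ == "__main__":
    for statement in extension_load_statements("shellfs"):
        print(statement)
```

With the duckdb Python package installed, each statement could be passed to con.execute() on a connection from duckdb.connect(); the INSTALL step needs network access the first time it runs.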


🚀 Extensions We’ve Built

We don’t just consult; we build.

✈️ Airport

Airport is a DuckDB extension that adds native support for Apache Arrow Flight, enabling high-performance reads and writes between DuckDB and remote Flight servers.

This extension turns DuckDB into an Apache Arrow Flight client, letting users stream columnar data back and forth across networks without conversions or middle layers.

Use cases include:

  • Treating remote datasets as local SQL tables
  • Building distributed analytics with zero ETL
  • Accelerating ingestion pipelines using Arrow-native tools

🦆 Bitfilters

Bitfilters provides space-efficient probabilistic data structures for fast set membership testing and approximate duplicate detection. Use it to pre-filter expensive operations, optimize joins, and accelerate analytics on large datasets.
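To make "probabilistic set membership" concrete, here is a minimal Bloom filter sketch in plain Python. It illustrates the general technique only and is not the Bitfilters implementation or its SQL API; the class name, bit count, and hashing scheme are all assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive several bit positions by salting SHA-256 with a seed.
        # (A real filter would use a faster non-cryptographic hash.)
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


if __name__ == "__main__":
    seen = BloomFilter()
    seen.add("order-1001")
    print(seen.might_contain("order-1001"))
```

A "definitely absent" answer lets a pipeline skip an expensive lookup or join probe entirely, which is where the pre-filtering benefit comes from.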


🔒 Crypto

Crypto adds cryptographic hash functions and HMAC to DuckDB, supporting a wide range of secure hash algorithms for data integrity, authentication, and security workflows.


📊 Datasketches

Datasketches integrates DuckDB with Apache DataSketches for scalable, approximate analytics. It enables distinct counting, quantile estimation, set operations, and sketch serialization for real-time and distributed pipelines.


🧮 EvalExpr_Rhai

EvalExpr_Rhai enables inline evaluation of Rhai scripting language expressions within SQL queries. It supports custom logic, transformations, and dynamic calculations, all securely sandboxed.


🔍 Fuzzycomplete

Fuzzycomplete is an alternative SQL completion engine using fuzzy string matching inspired by VS Code. It provides intuitive, context-aware table name suggestions across multiple databases and schemas.


#️⃣ Hashfuncs

Hashfuncs offers high-performance non-cryptographic hash functions for indexing, partitioning, caching, and Bloom filter construction. Optimized for speed and uniform distribution.


🗺️ Lindel

Lindel brings advanced spatial indexing via Hilbert and Morton/Z-Order curves. It linearizes multi-dimensional data for efficient clustering, sorting, and range queries, boosting performance for GIS and time-series workloads.
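The idea of linearizing multi-dimensional data can be sketched with a two-dimensional Morton (Z-order) encoding, which interleaves the bits of the coordinates into a single sortable integer. This is a plain-Python illustration of the general technique, not Lindel's implementation or function signatures; the function name and the 16-bit default are assumptions.

```python
def morton_encode_2d(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order key.

    Bits of x land in even positions and bits of y in odd positions.
    """
    if x < 0 or y < 0:
        raise ValueError("coordinates must be non-negative")
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


if __name__ == "__main__":
    points = [(3, 5), (0, 1), (1, 0), (2, 2)]
    # Sorting by the Morton key tends to keep points that are close in
    # 2-D space close together in the 1-D ordering.
    for point in sorted(points, key=lambda p: morton_encode_2d(*p)):
        print(point, morton_encode_2d(*point))
```

Sorting or clustering rows by such a key is what allows range queries over several dimensions to touch fewer, more contiguous blocks of data.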


🌲 Marisa

Marisa adds MARISA trie support for fast string lookups, prefix searches, and predictive text operations. Enables efficient storage and querying of large string sets.


📡 Radio

Radio enables real-time event integration for DuckDB, supporting WebSocket and Redis Pub/Sub. Query and broadcast live event streams with SQL.


⚡ Rapidfuzz

Rapidfuzz provides high-performance fuzzy string matching powered by RapidFuzz. Use it for similarity scoring, partial matching, and token-based comparisons in data cleaning, deduplication, and search.


🐚 ShellFS

ShellFS treats shell commands as virtual files: stream data into DuckDB from any shell output, and send query results directly to command-line tools. Perfect for ETL, automation, and UNIX-style workflows.


🎲 Stochastic

Stochastic adds comprehensive statistical distribution functions for probability calculations, random sampling, and advanced analytics. Supports a wide range of continuous and discrete distributions.


📈 Textplot

Textplot brings text-based data visualization directly to SQL. Create ASCII/Unicode bar charts, density plots, and more for quick data exploration, dashboards, and documentation.


🌊 Tributary

Tributary integrates DuckDB with Apache Kafka for real-time streaming analytics. It enables direct ingestion and querying of Kafka topics, with support for output and advanced stream processing planned for the future.


🧠 Why Build a DuckDB Extension?

DuckDB’s embedded nature makes it ideal for data products, analytics pipelines, and developer tools. Extensions let you:

  • Keep data in place: query without moving to a warehouse
  • Build SQL-first interfaces to domain-specific systems
  • Expose custom logic to analysts, not just developers
  • Avoid expensive or brittle ETL

We’ve helped clients query REST APIs, federate SQL over Arrow, and even embed model inference inside SQL.


🧱 How Query.Farm Can Help

We offer:

  • ✅ Custom extension development: tailored to your use case
  • 🔍 Architecture and performance consulting: for embedding DuckDB
  • 🧪 Prototyping and rapid development: get results fast
  • 🧱 C/C++ and Arrow expertise: for efficient memory and I/O handling
  • 📦 Packaging and deployment support: for both open source and proprietary extensions

Whether you’re building a startup around DuckDB or modernizing an internal data platform, we’ll help you go from idea to shipped.


✉️ Let’s Build Something

Interested in building your own extension or embedding DuckDB in your stack?

📬 hello@query.farm: we’d love to learn more.