DuckDB Extension Development
DuckDB extensions let you add new capabilities to DuckDB without modifying its core, from custom functions to new table sources, file formats, storage layers, or external integrations.
At Query.Farm, we specialize in writing high-performance, production-ready DuckDB extensions for real-world data systems.
What is a DuckDB Extension?
Extensions are modular plugins that expand what DuckDB can do. Common extension types include:
- Custom SQL functions: scalar, aggregate, table-producing
- New data sources: REST APIs, file formats, remote systems
- New table functions: like read_json(), read_parquet(), or your own
- Integration layers: bridge DuckDB with Arrow, Flight, Kafka, S3, or internal services
- External libraries: wrap or expose domain-specific C/C++ libraries to SQL
DuckDB loads extensions dynamically at runtime, so your logic stays decoupled from the DuckDB binary, which is ideal for portability, deployment, and local-first environments.
Extensions We've Built
We don't just consult; we build.
Airport
Airport is a DuckDB extension that adds native support for Apache Arrow Flight, enabling high-performance reads and writes between DuckDB and remote Flight servers.
This extension turns DuckDB into an Apache Arrow Flight client, letting users stream columnar data back and forth across networks without conversions or middle layers.
Use cases include:
- Treating remote datasets as local SQL tables
- Building distributed analytics with zero ETL
- Accelerating ingestion pipelines using Arrow-native tools
Bitfilters
Bitfilters provides space-efficient probabilistic data structures for fast set membership testing and approximate duplicate detection. Use it to pre-filter expensive operations, optimize joins, and accelerate analytics on large datasets.
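To make the pre-filtering idea concrete, here is a minimal Bloom filter sketch in plain Python. It illustrates the general technique only and is not the Bitfilters API; the class name, the parameters, and the choice of salted SHA-256 as the hash are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: k hash probes into an m-bit array (illustrative only)."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions by salting a SHA-256 hash with the probe index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


seen = BloomFilter()
for key in ["alice", "bob", "carol"]:
    seen.add(key)

candidates = ["alice", "dave", "carol", "erin"]
# Only candidates that pass the cheap filter go on to the expensive exact check.
to_check = [c for c in candidates if seen.might_contain(c)]
```

A Bloom filter never produces false negatives, so every key that was added always passes; it may occasionally let an absent key through, which is why an exact check still follows the filter.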
Crypto
Crypto adds cryptographic hash functions and HMAC to DuckDB, supporting a wide range of secure hash algorithms for data integrity, authentication, and security workflows.
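As a rough illustration of the difference between a plain hash and an HMAC, the sketch below uses Python's standard hashlib and hmac modules. It does not show the Crypto extension's SQL functions; the message, the key, and the verify helper are hypothetical values chosen for the example.

```python
import hashlib
import hmac

message = b"order_id=42;amount=19.99"  # hypothetical payload
secret_key = b"example-shared-secret"  # hypothetical key, never hard-code real keys

# A plain hash detects accidental changes but anyone can recompute it.
digest = hashlib.sha256(message).hexdigest()

# An HMAC mixes in a secret key, so only key holders can produce a valid tag.
tag = hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, msg: bytes, expected_hex: str) -> bool:
    """Recompute the tag and compare in constant time."""
    actual = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(actual, expected_hex)
```

Using a constant-time comparison when checking tags avoids leaking information through timing differences.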
Datasketches
Datasketches integrates DuckDB with Apache DataSketches for scalable, approximate analytics. Enables distinct counting, quantile estimation, set operations, and sketch serialization for real-time and distributed pipelines.
EvalExpr_Rhai
EvalExpr_Rhai enables inline evaluation of Rhai scripting language expressions within SQL queries. Supports custom logic, transformations, and dynamic calculations, all securely sandboxed.
Fuzzycomplete
Fuzzycomplete is an alternative SQL completion engine using fuzzy string matching inspired by VS Code. Provides intuitive, context-aware table name suggestions across multiple databases and schemas.
Hashfuncs
Hashfuncs offers high-performance non-cryptographic hash functions for indexing, partitioning, caching, and Bloom filter construction. Optimized for speed and uniform distribution.
Lindel
Lindel brings advanced spatial indexing via Hilbert and Morton/Z-Order curves. It linearizes multi-dimensional data for efficient clustering, sorting, and range queries, boosting performance for GIS and time-series workloads.
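The linearization idea can be sketched with a two-dimensional Morton (Z-order) encoding, which interleaves the bits of the two coordinates into a single sortable integer. This is a plain-Python illustration of the technique, not Lindel's functions; the function names and the 16-bit default are assumptions.

```python
from typing import Tuple


def morton_encode_2d(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x (even positions) and y (odd positions)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def morton_decode_2d(code: int, bits: int = 16) -> Tuple[int, int]:
    """Invert morton_encode_2d by pulling the even and odd bits apart."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y


points = [(3, 5), (0, 0), (1, 1), (2, 0)]
# Sorting by the Morton code tends to keep spatially close points adjacent.
ordered = sorted(points, key=lambda p: morton_encode_2d(*p))
```

Sorting or clustering rows by such a code lets a one-dimensional ordering approximate locality in two dimensions, which is the property range queries benefit from.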
Marisa
Marisa adds MARISA trie support for fast string lookups, prefix searches, and predictive text operations. Enables efficient storage and querying of large string sets.
Radio
Radio enables real-time event integration for DuckDB, supporting WebSocket and Redis Pub/Sub. Query and broadcast live event streams with SQL.
Rapidfuzz
Rapidfuzz provides high-performance fuzzy string matching powered by RapidFuzz. Use it for similarity scoring, partial matching, and token-based comparisons in data cleaning, deduplication, and search.
ShellFS
ShellFS treats shell commands as virtual files: stream data into DuckDB from any shell output, and send query results directly to command-line tools. Perfect for ETL, automation, and UNIX-style workflows.
Stochastic
Stochastic adds comprehensive statistical distribution functions for probability calculations, random sampling, and advanced analytics. Supports a wide range of continuous and discrete distributions.
Textplot
Textplot brings text-based data visualization directly to SQL. Create ASCII/Unicode bar charts, density plots, and more for quick data exploration, dashboards, and documentation.
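For a sense of what a text-based bar chart involves, here is a small plain-Python sketch that scales values to a fixed width and renders them with ASCII characters. It is not Textplot's implementation or interface; the bar_chart function, its parameters, and the sample data are hypothetical, and the input is assumed to be non-empty.

```python
from typing import Mapping


def bar_chart(values: Mapping[str, float], width: int = 20) -> str:
    """Render one ASCII bar per label, scaled so the largest value fills `width`."""
    peak = max(values.values())
    label_width = max(len(label) for label in values)
    lines = []
    for label, value in values.items():
        filled = round(width * value / peak) if peak > 0 else 0
        lines.append(f"{label.ljust(label_width)} | {'#' * filled} {value}")
    return "\n".join(lines)


chart = bar_chart({"north": 10, "south": 5, "east": 0}, width=10)
print(chart)
```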
Tributary
Tributary integrates DuckDB with Apache Kafka for real-time streaming analytics. Enables direct ingestion and querying of Kafka topics, with future support for output and advanced stream processing.
Why Build a DuckDB Extension?
DuckDB's embedded nature makes it ideal for data products, analytics pipelines, and developer tools. Extensions let you:
- Keep data in place: query without moving to a warehouse
- Build SQL-first interfaces to domain-specific systems
- Expose custom logic to analysts, not just developers
- Avoid expensive or brittle ETL
We've helped clients query REST APIs, federate SQL over Arrow, and even embed model inference inside SQL.
How Query.Farm Can Help
We offer:
- Custom extension development: tailored to your use case
- Architecture and performance consulting: for embedding DuckDB
- Prototyping and rapid development: get results fast
- C/C++ and Arrow expertise: for efficient memory and I/O handling
- Packaging and deployment support: for both open source and proprietary extensions
Whether you're building a startup around DuckDB or modernizing an internal data platform, we'll help you go from idea to shipped.
Let's Build Something
Interested in building your own extension or embedding DuckDB in your stack?
hello@query.farm: we'd love to learn more.