Streaming
Kafka
Attach a Kafka cluster and consume topics as DuckDB tables — no Connect cluster, no Flink job, no bespoke consumer to babysit.
Overview
What Kafka does
The Kafka connector turns any Kafka-compatible cluster (Apache Kafka, Redpanda, MSK, Confluent Cloud) into an attachable SQL catalog. Topics show up as tables you can SELECT from, filter, and join against the rest of your data.
Consume from an offset or tail the live tail of a topic, decode JSON or Avro payloads into typed columns, and write rows back to a topic with a table function. The worker manages consumer groups, schema-registry lookups, and backpressure so your SQL stays simple.
Highlights
- Consume topics as tables with JSON or Avro decoding
- Tail live messages or replay from a specific offset
- Schema Registry integration for typed columns
- Produce rows back to a topic from SQL
- Works with Apache Kafka, Redpanda, MSK, and Confluent Cloud
Attach & query
-- Attach the Kafka worker from the Orchard
ATTACH 'kafka' AS kafka (
TYPE vgi,
LOCATION 'https://orchard.query.farm/kafka',
TOKEN getenv('ORCHARD_TOKEN')
);
-- Tail the last 10 minutes of an orders topic
SELECT key, value:customer_id AS customer_id, value:total AS total, ts
FROM read_topic('orders', from => 'latest')
WHERE ts > now() - INTERVAL 10 MINUTE
ORDER BY ts DESC;
Requires the VGI extension. Set ORCHARD_TOKEN after subscribing.
What you get
Tables & functions it exposes
Once attached, the connector adds these objects to your SQL session.
kafka.topics One row per topic with partition and offset metadata.
kafka.consumer_groups Lag and assignment per consumer group.
read_topic(topic, from) Stream a topic as rows, decoded to typed columns.
produce(topic, rows) Write a result set back to a Kafka topic.
kafka_credentials SASL / TLS credentials stored as a DuckDB secret.
Use cases
What teams build with Kafka
Ad-hoc debugging of what is actually flowing through a topic.
Join a live event stream against reference tables in your warehouse.
Backfill a table from a topic replay without standing up a pipeline.
Keep exploring
Other connectors
IMAP
Query mailboxes and messages over IMAP as tables.
Row / Column Level Security
Governance
Policy-based row filtering and column masking for any catalog.
HostQuery
Federation
Federate queries across remote database hosts.
Ready to attach Kafka?
Subscribe in minutes, or talk to us about a custom connector for your stack.