Hedgineer Podcast S2E3 - DuckDB

Published

September 10, 2025

Query.Farm joined Michael Watson on the Hedgineer Podcast.

“DuckDB, Apache Arrow, & the Future of Data Engineering”

In this episode, Michael Watson sits down with Rusty Conover, the world’s most prolific DuckDB extension developer, for an in-depth discussion on building next-generation real-time, large-scale data systems.

Rusty shares insights from his extensive experience in data engineering, including work at multi-manager hedge funds, and explains why DuckDB’s fast, in-process, C++-based architecture is redefining the big data landscape. They explore the growing DuckDB ecosystem, the Apache Arrow columnar format, and open table formats like Iceberg, Delta Lake, and the new DuckLake.

What You’ll Learn in This Episode

  • The DuckDB Revolution: Why this “blazingly fast” in-process database is a game-changer that can simplify or replace entire ETL stacks.

  • A Tour of DuckDB Extensions: A look inside some of Rusty’s 15 extensions, including Airport for Arrow integration, Crypto, ShellFS, and TextPlot.

  • Diving into Apache Arrow: Understanding columnar in-memory data, zero-copy operations, and Arrow Flight for efficient data movement.

  • The Battle of Open Table Formats: Comparing Iceberg, Delta Lake, and DuckLake’s database-centric approach.

  • DuckDB vs. The World: How DuckDB competes with KDB for financial data and ClickHouse for analytics, and how it complements large-scale engines like Apache Spark.

  • Parquet Deep Dive: Key differences between Parquet V1 and V2, plus modern compression strategies and encodings.

  • The Future of DuckDB: A sneak peek at upcoming features such as time travel and the MERGE INTO statement for simplified change data capture (CDC).

Hosted by Michael Watson, The Hedgineer Podcast explores AI, technology, and data in the hedge fund, asset management, and prop trading space.