WebMacro
Load DuckDB SQL macros (scalar and table) from a remote URL. Define a macro in a Git repo, gist, or any HTTP-served file; pull it into a session with one call. The point is centralized, versioned macro distribution — one source of truth for shared SQL helpers across machines and teammates.
Install
-- Install the extension
INSTALL webmacro FROM community;
-- Load it into your session
LOAD webmacro;
INSTALL webmacro FROM community;
LOAD webmacro;
-- Pull a macro definition from a URL you control
SELECT load_macro_from_url(
'https://raw.githubusercontent.com/your-org/sql-macros/main/search_posts.sql'
) AS res;
-- Use it like any other DuckDB macro
SELECT * FROM search_posts('qxip.bsky.social', text := 'quack'); Technical Overview
Why Use WebMacro?
Define a DuckDB macro once — in a Git repo, gist, or any HTTP-served file — and LOAD it into any DuckDB session by URL. The point is centralized, versioned macro distribution: one source of truth for shared SQL helpers across machines, teammates, and notebooks.
📦 What this extension is for
DuckDB macros are great for encapsulating reusable SQL — a normalization helper, a parameterized search, a custom aggregate pattern. The friction is distribution: copy-paste between notebooks, drift between machines, no version control on the SQL itself. WebMacro replaces copy-paste with a URL.
- • One source of truth: Host the macro definition where your team already collaborates — a Git repo, an internal wiki that serves raw text, an S3 bucket, a private gist. Everyone calling
load_macro_from_urlgets the same SQL. - • Version-controlled SQL: Pin a specific commit URL (e.g.
https://raw.githubusercontent.com/org/repo/<sha>/macros/foo.sql) and you've got reproducible macro loading. Reviewable in pull requests; rollback is a URL change. - • Session-start bootstrap: Load a handful of standard macros at the top of a notebook or script — one
SELECT load_macro_from_url(url) FROM (VALUES (...))call and the session has its toolbox. - • Both scalar and table macros: DuckDB supports scalar
MACROand tableMACRO. WebMacro loads either — whatever the URL's body parses as.
🔌 How it works
load_macro_from_url is a single scalar function. Internally: GET the URL, validate the body, register the macro on the current connection, return a status string. There is no caching, no manifest, no dependency graph — it's deliberately the simplest possible primitive.
- • Plain HTTP(S) GET: Uses
cpp-httplibwith OpenSSL for TLS. Redirects are followed automatically (so agist.github.com/...URL that 301s togist.githubusercontent.com/...works). The read timeout is 10 seconds. - • Substring validation: The fetched body must contain a
CREATE [OR REPLACE] [TEMP|TEMPORARY] MACROtoken. It's then scanned for dangerous keywords (DELETE,DROP,TRUNCATE,ALTER,GRANT,REVOKE,CREATE USER,EXEC,EXECUTE,SHUTDOWN,RESTART,DETACH); a hit raises an error. This is a substring check, not a SQL parser — treat it as a guardrail against accidentally fetching the wrong file, not a security boundary. - • Session-scoped registration: The macro is created on the current DuckDB connection (using whatever scope the SQL itself declared — persistent,
TEMP, orOR REPLACE). It lives until the session ends or the macro is dropped, just like a hand-typedCREATE MACRO. - • WASM build: In DuckDB-WASM the extension uses a WASM-friendly HTTP shim instead of OpenSSL httplib — same function surface, browser-fetch under the hood.
🛡️ Trust model
This extension fetches and executes SQL from a URL. Any URL you load is, in effect, code running in your DuckDB session. The trust posture below is the central caveat — read it before pointing this at anything you didn't author.
- • Only load from URLs you control: Your own Git repos, your team's internal artifact host, your own gists. Don't load from URLs found in chat messages, Stack Overflow, or random blog posts — same posture you'd use for
curl ... | bash. See OWASP — Code Injection. - • The dangerous-command filter is not a sandbox: It's a literal substring scan over the response body. It catches obvious things (
DROP TABLE,DELETE FROM) but does not understand SQL — comments, string literals, and creative formatting all bypass it. A motivated author can bypass the filter; the filter is there to catch wrong-file-by-accident, not malice. - • Pin URLs to immutable revisions: If you load from a Git host, prefer URLs with a commit SHA over branch names —
.../<sha>/macros/foo.sqlrather than.../main/macros/foo.sql. That way a compromised or rebased branch doesn't silently change what you load tomorrow. - • TLS is on by default — keep it on: HTTPS prevents network-level tampering between your DuckDB and the macro host. Plain
http://URLs work but should be avoided outside of localhost / lab networks. - • No cross-session caching: Every
load_macro_from_urlcall hits the network. If the upstream changes, the next session sees the change. Pin URLs for reproducibility.
🎯 Common Use Cases
Team macro library in a Git repo
Keep a macros/ directory in a shared Git repo, one .sql file per macro. At session start, load_macro_from_url the ones you need. Pull requests review macro changes; URLs pinned to a tag give you reproducible builds.
Reusable analyst toolbox
Domain-specific helpers (date bucketing for your fiscal calendar, custom string normalization, parameterized lookup macros) live at stable URLs and load at the top of every analyst notebook — no copy-paste drift.
Live demos and tutorials
Host an example macro at a stable URL; tutorials and blog posts can load_macro_from_url it in one line instead of pasting twenty lines of CREATE MACRO. Readers see the focal SQL, not the boilerplate.
Versioned SQL distribution to ephemeral workers
Short-lived workers that spin up DuckDB (Lambda, ETL containers) bootstrap their macro environment from a single trusted URL — no need to bake macro SQL into the image.
Deep Dive
Technical Details
What you can do with one query
Define a macro in a Git repo, point load_macro_from_url at the raw URL, and use it like any built-in DuckDB function:
SELECT load_macro_from_url(
'https://raw.githubusercontent.com/your-org/sql-macros/main/search_posts.sql'
);
-- 'Successfully loaded macro: search_posts'
SELECT *
FROM search_posts('qxip.bsky.social', text := 'quack');
The whole extension is one function. One URL is one macro. The interesting part isn’t the protocol — it’s that the URL becomes the unit of distribution: pull requests review the SQL, tags pin versions, and every machine in your team gets the same macro from the same place.
load_macro_from_url fetches arbitrary text from a URL and executes it as SQL in your DuckDB session. Treat every URL you load with the same posture you’d use for curl ... | bash:
- Only load from URLs you control or fully trust. Your own Git repos, your team’s internal artifact host, your own gists — not arbitrary URLs from chat messages, blog posts, or Stack Overflow snippets.
- The built-in dangerous-command filter is not a sandbox. It’s a substring scan for
DROP,DELETE,ALTER, etc. over the raw response body. Comments and string literals bypass it. A determined author can defeat it. The filter exists to catch the wrong file by accident, not malicious SQL. - Pin URLs to immutable revisions. Prefer
https://raw.githubusercontent.com/org/repo/<commit-sha>/macros/foo.sqlover.../main/macros/foo.sqlso a rebase or branch compromise doesn’t silently change what you load. - Keep HTTPS. Plain
http://URLs are vulnerable to in-flight tampering. The extension speaks TLS by default — use it.
In a multi-tenant or hosted DuckDB environment, consider not loading the WebMacro extension at all, or restricting which roles can use it. See OWASP — Code Injection for the threat model.
The single function
The whole API surface is one scalar function: load_macro_from_url(url VARCHAR) → VARCHAR — accepts a URL, returns the loaded macro’s name (or an error).
load_macro_from_url does exactly four things, in order:
GETthe URL (HTTPS supported via OpenSSL; redirects followed; 10 s read timeout).- Verify the response body contains a
CREATE MACROvariant —CREATE MACRO,CREATE OR REPLACE MACRO,CREATE TEMP MACRO,CREATE TEMPORARY MACRO, or theOR REPLACE TEMP/OR REPLACE TEMPORARYvariants. - Reject the response if it contains a dangerous keyword (
DELETE,DROP,TRUNCATE,ALTER,GRANT,REVOKE,CREATE USER,ALTER USER,DROP USER,DROP DATABASE,EXEC,EXECUTE,SHUTDOWN,RESTART,DETACH). - Execute the SQL on the current connection and return
Successfully loaded macro: <name>.
The macro is then callable by name in the session, like any hand-typed CREATE MACRO. There’s no separate registry to query and no unload — drop the macro with DROP MACRO if you need it gone before session end.
Hosting the macro
WebMacro doesn’t care where the URL lives, only that an HTTP GET against it returns a body that parses as a DuckDB macro. Common patterns:
- Git raw URL (recommended):
https://raw.githubusercontent.com/org/repo/<sha>/macros/foo.sql. Pinning to a commit SHA makes the macro deterministic forever. - Gist raw URL:
https://gist.githubusercontent.com/<user>/<id>/raw/foo.sql. Convenient for one-off sharing; harder to review than a Git PR. - Internal artifact server / S3 + presigned URL: any HTTP(S) endpoint that returns the SQL body works. TLS is honored.
- Self-hosted “macro registry”: a tiny static site at
macros.your-org.internal/mapping path → SQL. The whole thing can be a directory of.sqlfiles behind any static webserver.
Loading at session start
A typical bootstrap pattern: pull a known set of macros at the top of a notebook or script, fail loudly if any URL is unreachable.
INSTALL webmacro FROM community;
LOAD webmacro;
SELECT
url,
load_macro_from_url(url) AS status
FROM (VALUES
('https://raw.githubusercontent.com/your-org/sql-macros/main/search_posts.sql'),
('https://raw.githubusercontent.com/your-org/sql-macros/main/normalize_handle.sql'),
('https://raw.githubusercontent.com/your-org/sql-macros/main/active_users.sql')
) AS macros(url);
If any URL fails the fetch, validation, or dangerous-command checks, the query errors out — so a missing macro shows up immediately rather than as a “function not found” later.
Compared to alternatives
- Hand-rolled wrapper around
http_client+query. Possible —http_clientreturns the URL body, then youEXECUTEit. WebMacro collapses that to one statement and adds the (best-effort) safety filter. For everything beyond loading macros — calling REST APIs, posting form data, handling JSON responses — reach forhttp_clientdirectly. - Copy-pasting
CREATE MACROinto every notebook. The thing this extension exists to replace. Drift between notebooks is the failure mode. - Baking macros into a startup
.duckdbrc/ init script. Works for a single machine, doesn’t scale to a team or to ephemeral workers. WebMacro lets the URL be the artifact instead.
Caveats
- No caching. Every call hits the network. If you load the same macro 100 times in a session, that’s 100 HTTP requests.
- No dependency graph. If macro
Acalls macroB, you must loadBfirst. WebMacro doesn’t parse the SQL to discover dependencies. - No version negotiation. The URL is the version. If you want immutability, pin to a commit SHA or a content-addressed URL.
- Status is experimental. The function surface is small and stable in shape, but pin a known-good extension version in production setups.
API Reference
Function Documentation
Detailed documentation for each function including signatures, parameters, and examples.
load_macro_from_url
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
url | VARCHAR | Positional | HTTP or HTTPS URL whose response body is a SQL CREATE MACRO statement (scalar or table macro). Redirects are followed; any non-200 response raises an error. |
Returns
Description
Fetch the URL, validate that the response body is a macro definition, register the macro in the current DuckDB connection, and return Successfully loaded macro: <name>. The macro is then callable by name like any other DuckDB macro until the session ends.
The response body must contain one of CREATE MACRO, CREATE OR REPLACE MACRO, CREATE TEMP MACRO, CREATE TEMPORARY MACRO, CREATE OR REPLACE TEMP MACRO, or CREATE OR REPLACE TEMPORARY MACRO. The fetched SQL is rejected if it contains other dangerous statements (e.g. DROP, DELETE, ALTER, ATTACH, EXECUTE) — but that filter is a substring check, not a parser, so the security model is trust the URL, not trust the filter. See the Trust model section.
Examples
Load a macro from a Git-hosted raw URL
SELECT load_macro_from_url(
'https://raw.githubusercontent.com/your-org/sql-macros/main/search_posts.sql'
) AS res;
-- 'Successfully loaded macro: search_posts' Use the macro after loading
SELECT load_macro_from_url('https://quacks.cc/r/search_posts');
SELECT *
FROM search_posts('qxip.bsky.social', text := 'quack'); Load several macros at session start
SELECT load_macro_from_url(url) AS status
FROM (VALUES
('https://raw.githubusercontent.com/your-org/sql-macros/main/search_posts.sql'),
('https://raw.githubusercontent.com/your-org/sql-macros/main/normalize_handle.sql'),
('https://raw.githubusercontent.com/your-org/sql-macros/main/active_users.sql')
) AS macros(url); Practical Examples
Cookbook
Real-world recipes and patterns for common use cases.
Install and load
INSTALL webmacro FROM community;
LOAD webmacro;
The extension registers a single function: load_macro_from_url. Everything below is a recipe around it.
Load a macro from a Git repo
The recommended pattern — host the SQL in a Git repo, load by raw URL:
SELECT load_macro_from_url(
'https://raw.githubusercontent.com/your-org/sql-macros/main/search_posts.sql'
) AS res;
-- 'Successfully loaded macro: search_posts'
-- Now use it like any DuckDB macro
SELECT *
FROM search_posts('qxip.bsky.social', text := 'quack');
The body of the URL must contain a CREATE MACRO statement (scalar or table macro). See load_macro_from_url for the validation rules.
Pin to an immutable commit
Use a commit SHA in the URL so a branch rebase or upstream change can’t silently swap the macro out from under you:
SELECT load_macro_from_url(
'https://raw.githubusercontent.com/your-org/sql-macros/'
|| 'a3f8d21c9e7b4f0e6d2a1b5c8e9f0a1b2c3d4e5f'
|| '/macros/search_posts.sql'
);
The commit SHA pins the SQL forever — the same URL today and a year from now returns the same bytes. This is the closest thing to a versioned macro release.
Load several macros at once
A common bootstrap pattern at the top of a notebook or script:
SELECT
url,
load_macro_from_url(url) AS status
FROM (VALUES
('https://raw.githubusercontent.com/your-org/sql-macros/main/search_posts.sql'),
('https://raw.githubusercontent.com/your-org/sql-macros/main/normalize_handle.sql'),
('https://raw.githubusercontent.com/your-org/sql-macros/main/active_users.sql')
) AS macros(url);
Any URL that fails the fetch, the macro-statement check, or the dangerous-command filter raises an error — the bootstrap halts, you find out at session start instead of at first use.
Load from a private host with auth
load_macro_from_url issues a plain unauthenticated GET. If your macro lives behind auth, the simplest pattern is a presigned URL — generate a short-lived signed URL upstream (S3, GCS, etc.) and load that:
SET VARIABLE macro_url = getenv('MACRO_PRESIGNED_URL');
SELECT load_macro_from_url(:macro_url);
For an internally-routable host where TLS or network ACLs are the access control (e.g. https://macros.your-org.internal/foo.sql), a plain URL is fine.
Use scalar and table macros
Either macro form loads the same way — the difference is how you call the result.
A scalar macro definition at https://example.com/macros/normalize_handle.sql:
CREATE OR REPLACE MACRO normalize_handle(h)
AS lower(trim(h, '@'));
A table macro at https://example.com/macros/active_users.sql:
CREATE OR REPLACE MACRO active_users(since_date) AS TABLE
SELECT user_id, last_seen
FROM users
WHERE last_seen >= since_date;
After loading both, scalar macros go in the SELECT list; table macros go in the FROM clause:
SELECT load_macro_from_url('https://example.com/macros/normalize_handle.sql');
SELECT load_macro_from_url('https://example.com/macros/active_users.sql');
SELECT normalize_handle('@Alice ') AS handle,
u.last_seen
FROM active_users(DATE '2026-01-01') u;
Replace a macro mid-session
CREATE OR REPLACE MACRO is the supported way to update a macro you’ve already loaded. If the URL’s body uses CREATE OR REPLACE MACRO, re-loading swaps the definition in place:
-- Edit the SQL upstream, then re-run:
SELECT load_macro_from_url(
'https://raw.githubusercontent.com/your-org/sql-macros/main/search_posts.sql'
);
-- 'Successfully loaded macro: search_posts' -- now the new definition
If the macro was created without OR REPLACE, you’ll need to DROP MACRO it first.
Pair with http_client for richer pipelines
WebMacro is a one-trick extension on purpose — it loads macros and nothing else. For HTTP request patterns it doesn’t cover (POST bodies, custom headers, reading non-macro JSON), pair with the http_client extension, which exposes general-purpose http_get / http_post for SQL-level HTTP.
-- Fetch a macro index from your registry, then load each entry
INSTALL http_client FROM community;
LOAD http_client;
WITH index AS (
SELECT (http_get('https://macros.your-org.internal/index.json')->>'body')::JSON AS j
)
SELECT load_macro_from_url(unnest(j->'$.macros[*].url')::VARCHAR)
FROM index;
Drop a loaded macro
A loaded macro is just a regular DuckDB macro — drop it like any other:
DROP MACRO search_posts;
DROP MACRO IF EXISTS active_users;
See DROP MACRO for the upstream syntax.
Platform Support
Compatibility
Extension availability may vary by platform and DuckDB version. Check below to ensure this extension supports your environment before installation.
Quick Facts
Platforms
Supported platform architectures
Distribute SQL macros from a URL
Install WebMacro to load DuckDB scalar and table macros straight from a Git-hosted URL — no copy-paste, one source of truth, and version control by virtue of the URL.