HTTP Client
Make HTTP GET and POST requests directly from DuckDB SQL. Returns a JSON object with `status`, `reason`, and `body` you can extract inline. Best fit: ad-hoc API calls, low-volume per-row enrichment, and joining a SQL query with a small REST endpoint. Synchronous and buffered β not a streaming or high-throughput HTTP client.
Install
-- Install the extension
INSTALL http_client FROM community;
-- Load it into your session
LOAD http_client;
-- GET a URL and read the status code
SELECT (http_get('https://httpbin.org/get')->>'status')::INT AS status;
-- POST a JSON body
SELECT http_post(
'https://httpbin.org/post',
headers => MAP{'accept': 'application/json'},
params => MAP{'name': 'alice'}
);
-- POST as application/x-www-form-urlencoded
SELECT http_post_form(
'https://httpbin.org/post',
headers => MAP{},
params => MAP{'limit': '10'}
); Technical Overview
Why Make HTTP Requests from DuckDB?
Issue HTTP requests directly from a SQL query β no app-layer Python or Node.js between DuckDB and the API. Best fit: ad-hoc REST calls, low-volume per-row enrichment, and one-off integrations. For high-throughput or streaming workloads, keep a real HTTP client.
π What this extension is for
Three scalar functions β http_get, http_post, and http_post_form β that issue a synchronous HTTP request and return the response as a JSON object you can extract with DuckDB's JSON operators.
- β’ Per-row API enrichment: Join a column against a remote REST endpoint with
http_getβ one HTTP call per row, response parsed inline as JSON. Practical for tens or hundreds of rows; impractical at scale because each call is synchronous. - β’ Ad-hoc API queries: Hit a JSON API from a one-off
SELECTwithout writing a script. Combine with->>to pull specific fields out of the response body. - β’ Form-encoded POSTs:
http_post_formsends parameters asapplication/x-www-form-urlencodedβ the body shape OAuth token endpoints, older REST APIs, and HTML form handlers expect. - β’ JSON POSTs:
http_postJSON-encodes theparamsMAPinto the request body. Pair withheaders => MAP{'content-type': 'application/json'}for REST APIs that consume JSON.
π How it works
Each call opens a TCP connection, sends the request, blocks until the response arrives, and returns a JSON struct. There is no ATTACH, no secrets type, no connection pool, no retry logic β the extension is intentionally a thin SQL surface over a synchronous HTTP request.
- β’ Response shape: Every function returns a JSON object with
status(the HTTP status code as a number),reason(the HTTP reason phrase β"OK","Not Found", etc.), andbody(the response body as a string). Extract fields with->>and cast as needed. - β’ Headers and params are MAPs: Both
http_postandhttp_post_formtakeheadersandparamsas DuckDBMAPliterals βMAP{'key': 'value', ...}. Headers go on the wire as-is; theparamsmap is encoded according to which POST function you called. - β’ Synchronous and blocking: DuckDB blocks until the response arrives or the underlying TCP call fails. Slow endpoints stall the query; there is no per-row timeout knob exposed. Use small
LIMITclauses or pre-aggregate keys before calling out. - β’ JSON-typed return value: The result is a
JSONvalue, so you can chain->and->>directly. Cast(http_get(...)->>'body')::JSONto dive further into a JSON response body.
π‘οΈ Scope and caveats
The extension is honest-by-design about what it doesn't do. Read this before pointing it at production traffic.
- β’ Experimental status: Marked experimental by the upstream README β "USE AT YOUR OWN RISK!". Pin a known-good extension version if you depend on it in pipelines.
- β’ GET and POST only: No
PUT/PATCH/DELETEin the documented surface. For other HTTP methods, shell out tocurlvia shellfs. - β’ No retries, no backoff, no streaming: The request is issued once and the entire response body is buffered into memory. Build retry / backoff in your application layer, or wrap the call in a CTE that you re-run on failure.
- β’ Credentials live in SQL: Bearer tokens and other secrets are passed as plain SQL values in the
headersmap β anything inlined appears in the cached query plan andEXPLAINoutput. Source from the environment withgetenvand bind viaSET VARIABLE, as covered in Securing credentials. - β’ One call per row: Per-row enrichment patterns issue one HTTP request per input row. Ten thousand rows means ten thousand sequential calls β slow, and the remote server will likely rate-limit you. Pre-aggregate or cache before joining.
π― Common Use Cases
Enrich a small DuckDB result with a REST API
Join a few hundred rows against an external endpoint with http_get β geocoding, score lookups, slow-changing reference data. Cache the JSON response into a DuckDB table so repeat queries don't re-hit the API.
Trigger a webhook from a SQL pipeline
End-of-pipeline http_post to a webhook URL with a JSON body summarizing what just ran. Synchronous, so the webhook either succeeds or the SQL transaction sees the error.
Exchange OAuth credentials
http_post_form against a token endpoint with grant_type and credential fields β the application/x-www-form-urlencoded shape OAuth's Bearer Token spec requires.
One-off API exploration
Replace curl + jq with one SQL statement: hit the endpoint, cast the body to JSON, project the fields you want. Keeps the whole investigation inside a single DuckDB session.
Deep Dive
Technical Details
What you can do with one query
The shortest path from βREST APIβ to βDuckDB rowβ:
SELECT ((http_get('https://api.github.com/repos/duckdb/duckdb')->>'body')::JSON)->>'$.stargazers_count' AS stars;
http_get issues a synchronous HTTP GET, returns the response as a JSON object with status / reason / body, and lets you cast and chain ->> right where you need the field. No Python wrapper, no curl in the middle, no temp file.
This extension is for ad-hoc API calls and low-volume per-row enrichment from inside SQL. It supports GET and POST (JSON or form-encoded) β no PUT / PATCH / DELETE in the documented surface.
Calls are synchronous: DuckDB blocks until the response arrives. There is no retry, no backoff, no streaming, no per-row timeout knob β the entire response body is buffered into memory. One HTTP request per input row, sequentially. For high-throughput integrations or streaming responses, keep a real HTTP client. For curl-only features (cookies, custom TLS, complex auth), use shellfs instead.
Status is experimental β pin a known-good extension version in any pipeline that depends on it.
Architecture
The extension is a thin SQL surface over a synchronous HTTP request. There is no ATTACH, no secrets type, no connection pool, no background fetcher β just three scalar functions:
http_get(url)βGETrequest, no headers parameter in the documented signature.http_post(url, headers, params)βPOSTwith theparamsMAPJSON-encoded into the body.http_post_form(url, headers, params)βPOSTwithparamsURL-encoded asapplication/x-www-form-urlencoded.
Each call returns a JSON object β {status, reason, body} β that you extract with DuckDBβs JSON operators. The body is always a string; cast it with (body)::JSON if the response is JSON and you want to drill in.
Because the function shape is stable and synchronous, the result of an http_* call composes cleanly into CTEs, lateral joins, and CREATE TABLE AS β anywhere a scalar JSON value is acceptable. The cost is the obvious one: one HTTP call per evaluation, in the order DuckDB chooses to evaluate the scalar.
Securing credentials
Any value passed inline to headers or params ends up in the cached query plan and EXPLAIN output. For Bearer tokens, OAuth client secrets, and other credentials, source them from the environment with getenv and bind them via SET VARIABLE:
SET VARIABLE api_token = getenv('API_TOKEN');
SELECT http_post(
'https://api.example.com/users',
headers => MAP{
'authorization': 'Bearer ' || :api_token,
'content-type': 'application/json'
},
params => MAP{'email': '[email protected]'}
);
getenv resolves at parse time, so the literal credential never appears in any cached query plan or session log. The same pattern works for OAuth client credentials passed to http_post_form β see the Cookbook for that recipe.
Scoping per-row calls
Per-row enrichment is the most common reason to reach for this extension, and the most common reason to regret it. A query like
SELECT u.id, http_get('https://api.example.com/score/' || u.id::VARCHAR) AS r
FROM users u;
issues one HTTP call per row, sequentially, with no connection reuse exposed at the SQL layer. For ten rows itβs instant; for ten thousand rows itβs slow and very likely rate-limited by the upstream server.
Practical guardrails:
- Pre-aggregate to a small set of distinct keys before joining against the API. If
usershas 100k rows but only 200 unique scores, fetch the 200 unique scores into a CTE first. LIMITaggressively during development to avoid accidentally hammering an endpoint while you iterate.- Cache results into a DuckDB table with
CREATE TABLE ASβ repeat queries hit the cache, not the API. - Sleep is not exposed. If you need rate limiting between calls, run the query in batches from your application layer rather than in one large SQL statement.
For genuinely large fetches, prefer one bulk call that returns many rows over many small per-row calls β most REST APIs offer a list endpoint for that reason.
Compared to alternatives
- shellfs +
curlβ pipecurloutput throughread_csv/read_json. Use shellfs when you need streaming reads of large responses (HTTP Client buffers the whole body), curl-only features (cookies, complex TLS, client certs), or shell-pipeline composition withawk/jq. Use HTTP Client when the response is small and you want the result as a JSON value directly inside aSELECT. - httpserver β the inverse direction: serve DuckDB query results as an HTTP API rather than consume one. Pair them when you want one DuckDB instance to fetch from upstream APIs and another to expose results to downstream consumers.
- Application-layer Python / Node + DuckDB β appropriate when you need retries, parallelism, connection reuse, or any HTTP feature this extension doesnβt expose. The tradeoff is the boilerplate and the round-trip back into SQL.
Reference
Extension Contents
Quick reference to all available functions and settings organized by category.
| Name | Description | |
|---|---|---|
| Requests Issue HTTP requests from DuckDB SQL. [ | ||
| http_get() | Issue an HTTP GET against url and return the response as JSON | |
| http_head() | Issue an HTTP HEAD request β same response metadata as a GET, but without a body | |
| http_post() | Issue an HTTP POST with the params map JSON-encoded as the request body | |
| http_post_form() | Issue an HTTP POST with the params map encoded as application/x-www-form-urlencoded β the body shape used by HTML forms, many older REST APIs, and OAuth 2 | |
API Reference
Function Documentation
Detailed documentation for each function including signatures, parameters, and examples.
http_get
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
url | VARCHAR | Positional | The URL to fetch. Must include the scheme β https://... or http://.... |
Returns
A JSON object with status (the HTTP status code as a number), reason (the HTTP reason phrase, e.g. "OK" / "Not Found"), and body (the response body as a string).
Description
Issue an HTTP GET against url and return the response as JSON. Synchronous β DuckDB blocks until the response arrives or the request fails.
Extract fields from the result with DuckDB's JSON operators: ->> for string fields, -> for nested JSON. To dive into a JSON response body, cast it: ((http_get(...)->>'body')::JSON)->>'$.field'.
For REST endpoints that need headers (auth tokens, content negotiation), use http_post instead β http_get does not currently expose a headers parameter in the documented signature.
Examples
Read just the status code
SELECT (http_get('https://httpbin.org/get')->>'status')::INT AS status; Pull a field out of a JSON response body
WITH r AS (
SELECT http_get('https://httpbin.org/uuid') AS res
)
SELECT ((res->>'body')::JSON)->>'$.uuid' AS uuid
FROM r; Per-row enrichment against a REST API
SELECT u.id,
u.email,
((http_get('https://api.example.com/score/' || u.id::VARCHAR)->>'body')::JSON)->>'$.score' AS score
FROM users u
WHERE u.last_seen > now() - INTERVAL 1 HOUR
LIMIT 100; Related Functions
http_head
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
url | VARCHAR | Positional | The URL to issue the HEAD request against. |
Returns
A JSON object with the same status / reason / body shape as http_get. For a well-behaved server the body is empty.
Description
Issue an HTTP HEAD request β same response metadata as a GET, but without a body. Use it for liveness checks or to test whether a URL exists before fetching the full payload with http_get.
This function is exposed by the extension binary but is not documented in the upstream README β the signature could change in a future release. Treat it as best-effort.
Examples
Liveness check β is the URL reachable?
SELECT (http_head('https://httpbin.org/')->>'status')::INT AS status; http_post
Signature
Parameters
| Parameter | Type | Mode | Description |
|---|---|---|---|
url | VARCHAR | Positional | The URL to POST to. |
headers | MAP(VARCHAR, VARCHAR) | Named | Request headers as a MAP β MAP{'authorization': 'Bearer ...', 'content-type': 'application/json'}. Sent on the wire as-is. |
params | MAP(VARCHAR, VARCHAR) | Named | Body parameters as a MAP. JSON-encoded into the request body β keys become object keys, values become JSON string values. |
Returns
A JSON object with status, reason, and body β same shape as http_get.
Description
Issue an HTTP POST with the params map JSON-encoded as the request body. Use this for REST APIs that consume JSON; pair it with headers => MAP{'content-type': 'application/json'} so the server knows to parse the body as JSON.
For application/x-www-form-urlencoded bodies (older REST APIs, OAuth token endpoints, HTML form handlers), use http_post_form instead.
Keep credentials out of cached query plans by binding them through SET VARIABLE + getenv β see Securing credentials.
Examples
POST a JSON body
SELECT http_post(
'https://httpbin.org/post',
headers => MAP{'accept': 'application/json',
'content-type': 'application/json'},
params => MAP{'name': 'alice', 'plan': 'pro'}
); Authorize with a Bearer token sourced from the environment
SET VARIABLE api_token = getenv('API_TOKEN');
SELECT http_post(
'https://api.example.com/users',
headers => MAP{
'authorization': 'Bearer ' || :api_token,
'content-type': 'application/json'
},
params => MAP{'email': '[email protected]'}
); Read just the status to verify the call succeeded
SELECT (http_post('https://httpbin.org/post',
headers => MAP{},
params => MAP{'k': 'v'})->>'status')::INT AS status; Related Functions
http_post_form
Signature
Parameters
| Parameter | Type | Mode | Description |
|---|---|---|---|
url | VARCHAR | Positional | The URL to POST to. |
headers | MAP(VARCHAR, VARCHAR) | Named | Request headers as a MAP. Pass MAP{} if you don't need any. |
params | MAP(VARCHAR, VARCHAR) | Named | Form fields as a MAP. Each entry is URL-encoded into the request body using application/x-www-form-urlencoded. |
Returns
A JSON object with status, reason, and body β same shape as http_get.
Description
Issue an HTTP POST with the params map encoded as application/x-www-form-urlencoded β the body shape used by HTML forms, many older REST APIs, and OAuth 2.0 token endpoints.
The extension sets the request's Content-Type header to application/x-www-form-urlencoded for you; you don't need to put it in headers manually.
For JSON request bodies, use http_post instead.
Examples
Submit a form to httpbin
SELECT http_post_form(
'https://httpbin.org/post',
headers => MAP{},
params => MAP{'limit': '10', 'cursor': 'abc'}
); Exchange OAuth client credentials for an access token
SET VARIABLE client_id = getenv('OAUTH_CLIENT_ID');
SET VARIABLE client_secret = getenv('OAUTH_CLIENT_SECRET');
SELECT ((http_post_form(
'https://oauth.example.com/token',
headers => MAP{},
params => MAP{
'grant_type': 'client_credentials',
'client_id': :client_id,
'client_secret': :client_secret
}
)->>'body')::JSON)->>'$.access_token' AS access_token; Related Functions
Practical Examples
Cookbook
Real-world recipes and patterns for common use cases.
Read the response shape
Every http_* function returns a JSON object with three fields. Pull them out with ->>:
WITH r AS (
SELECT http_get('https://httpbin.org/get') AS res
)
SELECT
(res->>'status')::INT AS status,
(res->>'reason') AS reason,
res->>'body' AS body
FROM r;
To dive into a JSON response body, cast it on the way out: ((res->>'body')::JSON)->>'$.field'. See http_get for the full signature.
GET a JSON API and project a field
SELECT ((http_get('https://api.github.com/repos/duckdb/duckdb')->>'body')::JSON)->>'$.stargazers_count' AS stars;
Same pattern for any read endpoint that returns JSON β fetch, cast body to JSON, project with ->>.
POST a JSON body
params becomes the JSON body. Set content-type so the server knows to parse it as JSON:
SELECT http_post(
'https://api.example.com/users',
headers => MAP{
'accept': 'application/json',
'content-type': 'application/json'
},
params => MAP{
'name': 'Alice',
'email': '[email protected]'
}
);
See http_post.
POST as application/x-www-form-urlencoded
For HTML forms, older REST APIs, and most OAuth 2.0 token endpoints. The extension sets the Content-Type for you β you donβt need to put it in headers:
SELECT http_post_form(
'https://oauth.example.com/token',
headers => MAP{},
params => MAP{
'grant_type': 'client_credentials',
'client_id': :client_id,
'client_secret': :client_secret
}
);
See http_post_form.
Authorize with a Bearer token from the environment
Source credentials from the environment with getenv and bind them via SET VARIABLE so they donβt leak into cached query plans or EXPLAIN output:
SET VARIABLE api_token = getenv('API_TOKEN');
SELECT http_post(
'https://api.example.com/comments',
headers => MAP{
'authorization': 'Bearer ' || :api_token,
'content-type': 'application/json'
},
params => MAP{'body': 'Hello from DuckDB'}
);
Exchange OAuth credentials for an access token
The classic application/x-www-form-urlencoded handshake β followed by a JSON parse on the response body to lift the access token out:
SET VARIABLE client_id = getenv('OAUTH_CLIENT_ID');
SET VARIABLE client_secret = getenv('OAUTH_CLIENT_SECRET');
WITH t AS (
SELECT http_post_form(
'https://oauth.example.com/token',
headers => MAP{},
params => MAP{
'grant_type': 'client_credentials',
'client_id': :client_id,
'client_secret': :client_secret
}
) AS res
)
SELECT ((res->>'body')::JSON)->>'$.access_token' AS access_token
FROM t;
Per-row enrichment from a REST API
SELECT
u.id,
u.email,
((http_get('https://api.example.com/score/' || u.id::VARCHAR)->>'body')::JSON)->>'$.score' AS score
FROM users u
WHERE u.last_seen > now() - INTERVAL 1 HOUR
LIMIT 100;
Practical for tens or hundreds of rows. For thousands, pre-aggregate to distinct keys first or cache the result into a DuckDB table β one HTTP call per row adds up fast.
Cache responses into a table
Hit the API once, query the cache forever after:
CREATE TABLE api_scores AS
SELECT
u.id,
((http_get('https://api.example.com/score/' || u.id::VARCHAR)->>'body')::JSON)->>'$.score' AS score
FROM (SELECT DISTINCT id FROM users) u;
Re-run this only when you want a refresh. Subsequent joins against api_scores are local and fast.
Conditional logic on the status code
Branch in SQL based on the HTTP status β use CASE on the status field:
WITH r AS (
SELECT http_get('https://api.example.com/things/42') AS res
)
SELECT
CASE (res->>'status')::INT
WHEN 200 THEN 'ok'
WHEN 404 THEN 'missing'
WHEN 429 THEN 'rate-limited'
ELSE 'unexpected'
END AS outcome,
res->>'body' AS body
FROM r;
Liveness check with HEAD
SELECT (http_head('https://httpbin.org/')->>'status')::INT AS status;
http_head returns the same response metadata as http_get but the server skips the body. Useful for βis this URL reachable?β probes before a heavier GET.
Trigger a webhook from a SQL pipeline
End a pipeline by POSTing a summary to a webhook β synchronous, so the webhook either succeeds or the surrounding statement sees the error:
SET VARIABLE hook = getenv('WEBHOOK_URL');
SELECT http_post(
:hook,
headers => MAP{'content-type': 'application/json'},
params => MAP{
'event': 'pipeline_complete',
'rows': (SELECT count(*)::VARCHAR FROM staging),
'finished': now()::VARCHAR
}
); Platform Support
Compatibility
Extension availability may vary by platform and DuckDB version. Check below to ensure this extension supports your environment before installation.
Quick Facts
Platforms
Supported platform architectures
Compiled binary sizes
| Platform | Architecture | Size |
|---|---|---|
| Linux | aarch64 | 5.85 MB |
| Linux | x86_64 | 5.94 MB |
| Linux (musl) | x86_64 | 5.12 MB |
| macOS | Apple Silicon | 4.24 MB |
| macOS | Intel | 4.28 MB |
| Windows | x86_64 | 9.46 MB |
| WASM | eh | 64.2 KB |
| WASM | mvp | 57.9 KB |
| WASM | threads | 58.0 KB |
Gzipped download size from the DuckDB community-extensions registry.
HTTP calls from a SELECT
Install HTTP Client to issue GET / POST requests inside DuckDB queries β handy for joining a database column against a REST API.