🌐

HTTP Client

Make HTTP GET and POST requests directly from DuckDB SQL. Returns a JSON object with `status`, `reason`, and `body` you can extract inline. Best fit: ad-hoc API calls, low-volume per-row enrichment, and joining a SQL query with a small REST endpoint. Synchronous and buffered β€” not a streaming or high-throughput HTTP client.

Install

-- Install the extension
INSTALL http_client FROM community;

-- Load it into your session
LOAD http_client;

-- GET a URL and read the status code
SELECT (http_get('https://httpbin.org/get')->>'status')::INT AS status;

-- POST a JSON body
SELECT http_post(
  'https://httpbin.org/post',
  headers => MAP{'accept': 'application/json'},
  params  => MAP{'name': 'alice'}
);

-- POST as application/x-www-form-urlencoded
SELECT http_post_form(
  'https://httpbin.org/post',
  headers => MAP{},
  params  => MAP{'limit': '10'}
);

Technical Overview

Why Make HTTP Requests from DuckDB?

Issue HTTP requests directly from a SQL query β€” no app-layer Python or Node.js between DuckDB and the API. Best fit: ad-hoc REST calls, low-volume per-row enrichment, and one-off integrations. For high-throughput or streaming workloads, keep a real HTTP client.

πŸ“• What this extension is for

Three scalar functions β€” http_get, http_post, and http_post_form β€” that issue a synchronous HTTP request and return the response as a JSON object you can extract with DuckDB's JSON operators.

  • β€’ Per-row API enrichment: Join a column against a remote REST endpoint with http_get β€” one HTTP call per row, response parsed inline as JSON. Practical for tens or hundreds of rows; impractical at scale because each call is synchronous.
  • β€’ Ad-hoc API queries: Hit a JSON API from a one-off SELECT without writing a script. Combine with ->> to pull specific fields out of the response body.
  • β€’ Form-encoded POSTs: http_post_form sends parameters as application/x-www-form-urlencoded β€” the body shape OAuth token endpoints, older REST APIs, and HTML form handlers expect.
  • β€’ JSON POSTs: http_post JSON-encodes the params MAP into the request body. Pair with headers => MAP{'content-type': 'application/json'} for REST APIs that consume JSON.

πŸ”Œ How it works

Each call opens a TCP connection, sends the request, blocks until the response arrives, and returns a JSON struct. There is no ATTACH, no secrets type, no connection pool, no retry logic β€” the extension is intentionally a thin SQL surface over a synchronous HTTP request.

  • β€’ Response shape: Every function returns a JSON object with status (the HTTP status code as a number), reason (the HTTP reason phrase β€” "OK", "Not Found", etc.), and body (the response body as a string). Extract fields with ->> and cast as needed.
  • β€’ Headers and params are MAPs: Both http_post and http_post_form take headers and params as DuckDB MAP literals β€” MAP{'key': 'value', ...}. Headers go on the wire as-is; the params map is encoded according to which POST function you called.
  • β€’ Synchronous and blocking: DuckDB blocks until the response arrives or the underlying TCP call fails. Slow endpoints stall the query; there is no per-row timeout knob exposed. Use small LIMIT clauses or pre-aggregate keys before calling out.
  • β€’ JSON-typed return value: The result is a JSON value, so you can chain -> and ->> directly. Cast (http_get(...)->>'body')::JSON to dive further into a JSON response body.

πŸ›‘οΈ Scope and caveats

The extension is honest-by-design about what it doesn't do. Read this before pointing it at production traffic.

  • β€’ Experimental status: Marked experimental by the upstream README β€” "USE AT YOUR OWN RISK!". Pin a known-good extension version if you depend on it in pipelines.
  • β€’ GET and POST only: No PUT / PATCH / DELETE in the documented surface. For other HTTP methods, shell out to curl via shellfs.
  • β€’ No retries, no backoff, no streaming: The request is issued once and the entire response body is buffered into memory. Build retry / backoff in your application layer, or wrap the call in a CTE that you re-run on failure.
  • β€’ Credentials live in SQL: Bearer tokens and other secrets are passed as plain SQL values in the headers map β€” anything inlined appears in the cached query plan and EXPLAIN output. Source from the environment with getenv and bind via SET VARIABLE, as covered in Securing credentials.
  • β€’ One call per row: Per-row enrichment patterns issue one HTTP request per input row. Ten thousand rows means ten thousand sequential calls β€” slow, and the remote server will likely rate-limit you. Pre-aggregate or cache before joining.

🎯 Common Use Cases

Enrich a small DuckDB result with a REST API

Join a few hundred rows against an external endpoint with http_get β€” geocoding, score lookups, slow-changing reference data. Cache the JSON response into a DuckDB table so repeat queries don't re-hit the API.

Trigger a webhook from a SQL pipeline

End-of-pipeline http_post to a webhook URL with a JSON body summarizing what just ran. Synchronous, so the webhook either succeeds or the SQL transaction sees the error.

Exchange OAuth credentials

http_post_form against a token endpoint with grant_type and credential fields β€” the application/x-www-form-urlencoded shape OAuth's Bearer Token spec requires.

One-off API exploration

Replace curl + jq with one SQL statement: hit the endpoint, cast the body to JSON, project the fields you want. Keeps the whole investigation inside a single DuckDB session.

Deep Dive

Technical Details

What you can do with one query

The shortest path from β€œREST API” to β€œDuckDB row”:

SELECT ((http_get('https://api.github.com/repos/duckdb/duckdb')->>'body')::JSON)->>'$.stargazers_count' AS stars;

http_get issues a synchronous HTTP GET, returns the response as a JSON object with status / reason / body, and lets you cast and chain ->> right where you need the field. No Python wrapper, no curl in the middle, no temp file.

Best fit, scoped honestly

This extension is for ad-hoc API calls and low-volume per-row enrichment from inside SQL. It supports GET and POST (JSON or form-encoded) β€” no PUT / PATCH / DELETE in the documented surface.

Calls are synchronous: DuckDB blocks until the response arrives. There is no retry, no backoff, no streaming, no per-row timeout knob β€” the entire response body is buffered into memory. One HTTP request per input row, sequentially. For high-throughput integrations or streaming responses, keep a real HTTP client. For curl-only features (cookies, custom TLS, complex auth), use shellfs instead.

Status is experimental β€” pin a known-good extension version in any pipeline that depends on it.

Architecture

The extension is a thin SQL surface over a synchronous HTTP request. There is no ATTACH, no secrets type, no connection pool, no background fetcher β€” just three scalar functions:

Each call returns a JSON object β€” {status, reason, body} β€” that you extract with DuckDB’s JSON operators. The body is always a string; cast it with (body)::JSON if the response is JSON and you want to drill in.

Because the function shape is stable and synchronous, the result of an http_* call composes cleanly into CTEs, lateral joins, and CREATE TABLE AS β€” anywhere a scalar JSON value is acceptable. The cost is the obvious one: one HTTP call per evaluation, in the order DuckDB chooses to evaluate the scalar.

Securing credentials

Any value passed inline to headers or params ends up in the cached query plan and EXPLAIN output. For Bearer tokens, OAuth client secrets, and other credentials, source them from the environment with getenv and bind them via SET VARIABLE:

SET VARIABLE api_token = getenv('API_TOKEN');

SELECT http_post(
  'https://api.example.com/users',
  headers => MAP{
    'authorization': 'Bearer ' || :api_token,
    'content-type':  'application/json'
  },
  params  => MAP{'email': '[email protected]'}
);

getenv resolves at parse time, so the literal credential never appears in any cached query plan or session log. The same pattern works for OAuth client credentials passed to http_post_form β€” see the Cookbook for that recipe.

Scoping per-row calls

Per-row enrichment is the most common reason to reach for this extension, and the most common reason to regret it. A query like

SELECT u.id, http_get('https://api.example.com/score/' || u.id::VARCHAR) AS r
FROM users u;

issues one HTTP call per row, sequentially, with no connection reuse exposed at the SQL layer. For ten rows it’s instant; for ten thousand rows it’s slow and very likely rate-limited by the upstream server.

Practical guardrails:

  • Pre-aggregate to a small set of distinct keys before joining against the API. If users has 100k rows but only 200 unique scores, fetch the 200 unique scores into a CTE first.
  • LIMIT aggressively during development to avoid accidentally hammering an endpoint while you iterate.
  • Cache results into a DuckDB table with CREATE TABLE AS β€” repeat queries hit the cache, not the API.
  • Sleep is not exposed. If you need rate limiting between calls, run the query in batches from your application layer rather than in one large SQL statement.

For genuinely large fetches, prefer one bulk call that returns many rows over many small per-row calls β€” most REST APIs offer a list endpoint for that reason.

Compared to alternatives

  • shellfs + curl β€” pipe curl output through read_csv / read_json. Use shellfs when you need streaming reads of large responses (HTTP Client buffers the whole body), curl-only features (cookies, complex TLS, client certs), or shell-pipeline composition with awk / jq. Use HTTP Client when the response is small and you want the result as a JSON value directly inside a SELECT.
  • httpserver β€” the inverse direction: serve DuckDB query results as an HTTP API rather than consume one. Pair them when you want one DuckDB instance to fetch from upstream APIs and another to expose results to downstream consumers.
  • Application-layer Python / Node + DuckDB β€” appropriate when you need retries, parallelism, connection reuse, or any HTTP feature this extension doesn’t expose. The tradeoff is the boilerplate and the round-trip back into SQL.

Reference

Extension Contents

Quick reference to all available functions and settings organized by category.

Name Description
Requests

Issue HTTP requests from DuckDB SQL. [http_get](#http_get) for reads, [http_post](#http_post) for JSON-body POSTs, [http_post_form](#http_post_form) for application/x-www-form-urlencoded POSTs. Each returns a JSON object with status, reason, and body.

http_get() Issue an HTTP GET against url and return the response as JSON
http_head() Issue an HTTP HEAD request β€” same response metadata as a GET, but without a body
http_post() Issue an HTTP POST with the params map JSON-encoded as the request body
http_post_form() Issue an HTTP POST with the params map encoded as application/x-www-form-urlencoded β€” the body shape used by HTML forms, many older REST APIs, and OAuth 2

API Reference

Function Documentation

Detailed documentation for each function including signatures, parameters, and examples.

http_get

Scalar Function Requests
Signature
http_get(url: VARCHAR) β†’ JSON
Parameters (Positional)
Parameter Type Mode Description
url VARCHAR Positional The URL to fetch. Must include the scheme β€” https://... or http://....
Returns

A JSON object with status (the HTTP status code as a number), reason (the HTTP reason phrase, e.g. "OK" / "Not Found"), and body (the response body as a string).

Description

Issue an HTTP GET against url and return the response as JSON. Synchronous β€” DuckDB blocks until the response arrives or the request fails.

Extract fields from the result with DuckDB's JSON operators: ->> for string fields, -> for nested JSON. To dive into a JSON response body, cast it: ((http_get(...)->>'body')::JSON)->>'$.field'.

For REST endpoints that need headers (auth tokens, content negotiation), use http_post instead β€” http_get does not currently expose a headers parameter in the documented signature.

Examples
1

Read just the status code

SELECT (http_get('https://httpbin.org/get')->>'status')::INT AS status;
2

Pull a field out of a JSON response body

WITH r AS (
  SELECT http_get('https://httpbin.org/uuid') AS res
)
SELECT ((res->>'body')::JSON)->>'$.uuid' AS uuid
FROM r;
3

Per-row enrichment against a REST API

SELECT u.id,
       u.email,
       ((http_get('https://api.example.com/score/' || u.id::VARCHAR)->>'body')::JSON)->>'$.score' AS score
FROM users u
WHERE u.last_seen > now() - INTERVAL 1 HOUR
LIMIT 100;

http_head

Scalar Function Requests
Signature
http_head(url: VARCHAR) β†’ JSON
Parameters (Positional)
Parameter Type Mode Description
url VARCHAR Positional The URL to issue the HEAD request against.
Returns

A JSON object with the same status / reason / body shape as http_get. For a well-behaved server the body is empty.

Description

Issue an HTTP HEAD request β€” same response metadata as a GET, but without a body. Use it for liveness checks or to test whether a URL exists before fetching the full payload with http_get.

This function is exposed by the extension binary but is not documented in the upstream README β€” the signature could change in a future release. Treat it as best-effort.

Examples
1

Liveness check β€” is the URL reachable?

SELECT (http_head('https://httpbin.org/')->>'status')::INT AS status;

http_post

Scalar Function Requests
Signature
http_post(url: VARCHAR, headers := MAP(VARCHAR, VARCHAR), params := MAP(VARCHAR, VARCHAR)) β†’ JSON
Parameters
Parameter Type Mode Description
url VARCHAR Positional The URL to POST to.
headers MAP(VARCHAR, VARCHAR) Named Request headers as a MAP β€” MAP{'authorization': 'Bearer ...', 'content-type': 'application/json'}. Sent on the wire as-is.
params MAP(VARCHAR, VARCHAR) Named Body parameters as a MAP. JSON-encoded into the request body β€” keys become object keys, values become JSON string values.
Returns

A JSON object with status, reason, and body β€” same shape as http_get.

Description

Issue an HTTP POST with the params map JSON-encoded as the request body. Use this for REST APIs that consume JSON; pair it with headers => MAP{'content-type': 'application/json'} so the server knows to parse the body as JSON.

For application/x-www-form-urlencoded bodies (older REST APIs, OAuth token endpoints, HTML form handlers), use http_post_form instead.

Keep credentials out of cached query plans by binding them through SET VARIABLE + getenv β€” see Securing credentials.

Examples
1

POST a JSON body

SELECT http_post(
  'https://httpbin.org/post',
  headers => MAP{'accept': 'application/json',
                 'content-type': 'application/json'},
  params  => MAP{'name': 'alice', 'plan': 'pro'}
);
2

Authorize with a Bearer token sourced from the environment

SET VARIABLE api_token = getenv('API_TOKEN');

SELECT http_post(
  'https://api.example.com/users',
  headers => MAP{
    'authorization': 'Bearer ' || :api_token,
    'content-type': 'application/json'
  },
  params  => MAP{'email': '[email protected]'}
);
3

Read just the status to verify the call succeeded

SELECT (http_post('https://httpbin.org/post',
                  headers => MAP{},
                  params  => MAP{'k': 'v'})->>'status')::INT AS status;

http_post_form

Scalar Function Requests
Signature
http_post_form(url: VARCHAR, headers := MAP(VARCHAR, VARCHAR), params := MAP(VARCHAR, VARCHAR)) β†’ JSON
Parameters
Parameter Type Mode Description
url VARCHAR Positional The URL to POST to.
headers MAP(VARCHAR, VARCHAR) Named Request headers as a MAP. Pass MAP{} if you don't need any.
params MAP(VARCHAR, VARCHAR) Named Form fields as a MAP. Each entry is URL-encoded into the request body using application/x-www-form-urlencoded.
Returns

A JSON object with status, reason, and body β€” same shape as http_get.

Description

Issue an HTTP POST with the params map encoded as application/x-www-form-urlencoded β€” the body shape used by HTML forms, many older REST APIs, and OAuth 2.0 token endpoints.

The extension sets the request's Content-Type header to application/x-www-form-urlencoded for you; you don't need to put it in headers manually.

For JSON request bodies, use http_post instead.

Examples
1

Submit a form to httpbin

SELECT http_post_form(
  'https://httpbin.org/post',
  headers => MAP{},
  params  => MAP{'limit': '10', 'cursor': 'abc'}
);
2

Exchange OAuth client credentials for an access token

SET VARIABLE client_id     = getenv('OAUTH_CLIENT_ID');
SET VARIABLE client_secret = getenv('OAUTH_CLIENT_SECRET');

SELECT ((http_post_form(
  'https://oauth.example.com/token',
  headers => MAP{},
  params  => MAP{
    'grant_type':    'client_credentials',
    'client_id':     :client_id,
    'client_secret': :client_secret
  }
)->>'body')::JSON)->>'$.access_token' AS access_token;

Practical Examples

Cookbook

Real-world recipes and patterns for common use cases.

Read the response shape

Every http_* function returns a JSON object with three fields. Pull them out with ->>:

WITH r AS (
  SELECT http_get('https://httpbin.org/get') AS res
)
SELECT
  (res->>'status')::INT  AS status,
  (res->>'reason')       AS reason,
   res->>'body'          AS body
FROM r;

To dive into a JSON response body, cast it on the way out: ((res->>'body')::JSON)->>'$.field'. See http_get for the full signature.

GET a JSON API and project a field

SELECT ((http_get('https://api.github.com/repos/duckdb/duckdb')->>'body')::JSON)->>'$.stargazers_count' AS stars;

Same pattern for any read endpoint that returns JSON β€” fetch, cast body to JSON, project with ->>.

POST a JSON body

params becomes the JSON body. Set content-type so the server knows to parse it as JSON:

SELECT http_post(
  'https://api.example.com/users',
  headers => MAP{
    'accept':       'application/json',
    'content-type': 'application/json'
  },
  params  => MAP{
    'name':  'Alice',
    'email': '[email protected]'
  }
);

See http_post.

POST as application/x-www-form-urlencoded

For HTML forms, older REST APIs, and most OAuth 2.0 token endpoints. The extension sets the Content-Type for you β€” you don’t need to put it in headers:

SELECT http_post_form(
  'https://oauth.example.com/token',
  headers => MAP{},
  params  => MAP{
    'grant_type':    'client_credentials',
    'client_id':     :client_id,
    'client_secret': :client_secret
  }
);

See http_post_form.

Authorize with a Bearer token from the environment

Source credentials from the environment with getenv and bind them via SET VARIABLE so they don’t leak into cached query plans or EXPLAIN output:

SET VARIABLE api_token = getenv('API_TOKEN');

SELECT http_post(
  'https://api.example.com/comments',
  headers => MAP{
    'authorization': 'Bearer ' || :api_token,
    'content-type':  'application/json'
  },
  params  => MAP{'body': 'Hello from DuckDB'}
);

Exchange OAuth credentials for an access token

The classic application/x-www-form-urlencoded handshake β€” followed by a JSON parse on the response body to lift the access token out:

SET VARIABLE client_id     = getenv('OAUTH_CLIENT_ID');
SET VARIABLE client_secret = getenv('OAUTH_CLIENT_SECRET');

WITH t AS (
  SELECT http_post_form(
    'https://oauth.example.com/token',
    headers => MAP{},
    params  => MAP{
      'grant_type':    'client_credentials',
      'client_id':     :client_id,
      'client_secret': :client_secret
    }
  ) AS res
)
SELECT ((res->>'body')::JSON)->>'$.access_token' AS access_token
FROM t;

Per-row enrichment from a REST API

SELECT
  u.id,
  u.email,
  ((http_get('https://api.example.com/score/' || u.id::VARCHAR)->>'body')::JSON)->>'$.score' AS score
FROM users u
WHERE u.last_seen > now() - INTERVAL 1 HOUR
LIMIT 100;

Practical for tens or hundreds of rows. For thousands, pre-aggregate to distinct keys first or cache the result into a DuckDB table β€” one HTTP call per row adds up fast.

Cache responses into a table

Hit the API once, query the cache forever after:

CREATE TABLE api_scores AS
SELECT
  u.id,
  ((http_get('https://api.example.com/score/' || u.id::VARCHAR)->>'body')::JSON)->>'$.score' AS score
FROM (SELECT DISTINCT id FROM users) u;

Re-run this only when you want a refresh. Subsequent joins against api_scores are local and fast.

Conditional logic on the status code

Branch in SQL based on the HTTP status β€” use CASE on the status field:

WITH r AS (
  SELECT http_get('https://api.example.com/things/42') AS res
)
SELECT
  CASE (res->>'status')::INT
    WHEN 200 THEN 'ok'
    WHEN 404 THEN 'missing'
    WHEN 429 THEN 'rate-limited'
    ELSE       'unexpected'
  END AS outcome,
  res->>'body' AS body
FROM r;

Liveness check with HEAD

SELECT (http_head('https://httpbin.org/')->>'status')::INT AS status;

http_head returns the same response metadata as http_get but the server skips the body. Useful for β€œis this URL reachable?” probes before a heavier GET.

Trigger a webhook from a SQL pipeline

End a pipeline by POSTing a summary to a webhook β€” synchronous, so the webhook either succeeds or the surrounding statement sees the error:

SET VARIABLE hook = getenv('WEBHOOK_URL');

SELECT http_post(
  :hook,
  headers => MAP{'content-type': 'application/json'},
  params  => MAP{
    'event':     'pipeline_complete',
    'rows':      (SELECT count(*)::VARCHAR FROM staging),
    'finished':  now()::VARCHAR
  }
);

Platform Support

Compatibility

Extension availability may vary by platform and DuckDB version. Check below to ensure this extension supports your environment before installation.

Quick Facts

Software License MIT
Pricing Free
Written In C++
Source Available Yes
View on GitHub
Usage
56,562+
loads in last 30 days

Platforms

Linux
Linux (musl)
macOS
Windows
WASM
Supported platform architectures
Linux: aarch64, x86_64
Linux (musl): x86_64
macOS: Intel, Apple Silicon
Windows: x86_64
WASM: eh, threads, mvp
Compiled binary sizes
Platform Architecture Size
Linux aarch64 5.85 MB
Linux x86_64 5.94 MB
Linux (musl) x86_64 5.12 MB
macOS Apple Silicon 4.24 MB
macOS Intel 4.28 MB
Windows x86_64 9.46 MB
WASM eh 64.2 KB
WASM mvp 57.9 KB
WASM threads 58.0 KB

Gzipped download size from the DuckDB community-extensions registry.

DuckDB Versions

Release calendar
Supported
v1.4.4 v1.5.2

HTTP calls from a SELECT

Install HTTP Client to issue GET / POST requests inside DuckDB queries β€” handy for joining a database column against a REST API.