🧩

JSONata

Evaluate [JSONata](https://jsonata.org/) expressions against JSON values from inside DuckDB SQL. Best fit: reshaping nested JSON into a new shape, predicate filters on arrays, and concise traversals that get verbose in chains of `json_extract`. Implemented as a single scalar function β€” no `ATTACH` interface, no secrets.

Install

-- Install the extension
INSTALL jsonata FROM community;

-- Load it into your session
LOAD jsonata;

-- Extract a nested field
SELECT jsonata('Account.Name',
               '{"Account":{"Name":"Firefly"}}');

-- Filter an array with a predicate, then project
SELECT jsonata('Phone[type="mobile"].number', payload)
FROM contacts;

-- Reshape JSON in one expression
SELECT jsonata('{
  "name":   FirstName & " " & Surname,
  "mobile": Phone[type="mobile"].number
}', payload) AS reshaped
FROM contacts;

Technical Overview

Why Use JSONata from DuckDB?

JSONata is an open query and transformation language for JSON, modelled on XPath. This extension exposes a single scalar function β€” jsonata β€” that evaluates JSONata expressions against DuckDB JSON values. Reach for it when DuckDB's native json_extract chain gets verbose: array predicates, object reshaping, and conditional projections collapse into one expression.

πŸ“• What this extension is for

DuckDB's built-in JSON functions are great for one-field-one-path access. JSONata's strength is everything beyond that β€” filtering arrays by predicate, building a new object shape, traversing wildcards, and chaining transforms in a single string.

  • β€’ Concise nested paths: Walk dotted paths like Account.Order.Product.Price directly. No quoting per segment, no intermediate json_extract calls.
  • β€’ Predicate filters on arrays: Phone[type="mobile"].number selects matching elements and projects in one step. See JSONata predicates.
  • β€’ Object reshaping: Object construction lets you build a fresh JSON object literal β€” pull only the fields you need, rename them, compute new ones β€” inside a single jsonata call.
  • β€’ Higher-order array operations: $map, $filter, $reduce, $sort and friends operate on arrays inside the JSON value. Transformations that would need unnest + subquery in plain SQL stay inline.

πŸ”Œ How it works

The extension is a pure scalar UDF β€” no ATTACH, no secrets, no network. jsonata(expression, json_data) parses the expression, evaluates it against the supplied JSON value, and returns a JSON result. A 3-arg form passes a bindings object so the expression can reference $variable names.

  • β€’ One function, two arities: jsonata(expr, json) for the common case; jsonata(expr, json, bindings) when the expression needs external $variables β€” typically thresholds, ids, or per-row context lifted from other columns.
  • β€’ Constant-expression optimization: When the expression argument is a constant SQL literal β€” the usual case β€” the parser runs once and the compiled program is reused for every row. Per-row cost is JSON walk + apply, not parse-and-compile.
  • β€’ JSON in, JSON out: The return type is always JSON. Cast or unnest downstream β€” result::VARCHAR, result->>'$', or unnest(result::JSON[]) for arrays β€” exactly like any other DuckDB JSON expression.
  • β€’ Try expressions interactively: The official JSONata exerciser is the fastest way to iterate on an expression before pasting it into a SQL string. Same evaluator semantics.

πŸ›‘οΈ Scope and caveats

The extension wraps the JSONata language faithfully but lives inside SQL β€” the surrounding context is DuckDB, not Node.js. A few things are worth knowing up front.

  • β€’ Single function surface: Everything is reached through jsonata. There are no helper UDFs for individual JSONata operators β€” the expression string carries the whole computation.
  • β€’ Expressions are SQL strings: JSONata expressions live inside a SQL VARCHAR literal. Embedded double quotes need SQL escaping; for long expressions, store them in a CTE or a SQL variable and reference by name.
  • β€’ No DuckDB function callbacks: JSONata's JavaScript-binding feature isn't wired up β€” you can't register a DuckDB UDF as a callable inside the expression. Stick to JSONata's built-in function library.
  • β€’ Performance scales with expression and document size: Deeply-nested documents and complex expressions both cost. For very wide rows, materialize the JSONata projection once into a column instead of re-evaluating it in every downstream query.

🎯 Common Use Cases

Reshape an API payload into table columns

One jsonata call with an object-construction expression replaces a stack of json_extract_string projections β€” and stays readable when nested fields move.

Filter and project nested arrays

items[status="open" and amount > 100].id returns only the matching ids. The same logic in DuckDB native JSON functions needs unnest plus a subquery.

Aggregate inside a JSON document

$sum, $count, $max operate on arrays-in-JSON without leaving the expression β€” useful for per-row roll-ups of an embedded line-item list.

Parameterize an expression at query time

Use the 3-arg jsonata(expr, json, bindings) form to pass thresholds, user ids, or date cutoffs through $variables so a single expression handles many runtime values.

Deep Dive

Technical Details

What you can do with one query

The headline pattern β€” reshape a nested JSON column into a clean output object in a single SQL projection:

SELECT jsonata('{
  "name":     FirstName & " " & Surname,
  "mobile":   Phone[type="mobile"].number,
  "lifetime": $sum(Orders.(Quantity * UnitPrice))
}', payload) AS profile
FROM contacts;

One jsonata call replaces a stack of json_extract_string projections plus an unnest subquery for the order roll-up. The expression β€” array filter, object construction, and an aggregation function β€” stays in one string.

Best fit, scoped honestly

This extension exposes JSONata as a single scalar SQL function β€” jsonata, in two arities. There is no ATTACH interface, no secrets, no I/O. The function operates on JSON values DuckDB already has in memory.

It is the right tool when DuckDB’s native JSON functions get verbose β€” array predicate filters, object reshaping, multi-step transforms. It is not a JSON loader (use DuckDB’s read_json for that) and it does not register custom DuckDB UDFs as callables inside the JSONata expression.

Architecture

The extension is a Rust-implemented scalar UDF. Each call to jsonata takes a JSONata expression string and a JSON value, parses the expression into an AST, evaluates it against the document, and returns a JSON result. There is no network, no extension state between rows, and no DuckDB type beyond the standard JSON type.

When the expression argument is a SQL constant β€” by far the common case β€” the parser runs once and the compiled program is reused for every row. Per-row cost is dominated by the JSON walk plus operator application; deeply-nested documents and large arrays scale linearly with the data they touch.

The 3-arg form, jsonata(expr, json, bindings), takes a JSON object whose keys become $variable names inside the expression. This is how you pass per-query values β€” thresholds, ids, dates β€” into an otherwise static expression so the parse cache still applies.

When to reach for JSONata vs DuckDB native JSON

DuckDB ships a rich JSON function library β€” json_extract, json_extract_string, the ->/->> operators, json_structure, and a JSON data type. For one-field-one-path access these are the right call β€” vectorized, idiomatic SQL, no extra extension.

JSONata is the better fit when the path includes any of:

  • Predicate filters on arrays. Phone[type="mobile"].number selects matching elements and projects in one step. The native equivalent is unnest plus a subquery with a WHERE clause.
  • Object reshaping. Construction expressions build a fresh JSON object literal β€” pull, rename, and compute fields together. The native equivalent is one json_extract_string per output field.
  • Embedded aggregation. $sum, $count, $max operate on arrays inside the JSON document. The native equivalent leaves the JSON value, unnests, aggregates, and returns.
  • Higher-order array operations. $map, $filter, $reduce keep transforms inline. The native equivalent is a lateral subquery.

Concrete contrast β€” selecting the mobile number from a contact’s Phone array:

-- DuckDB native: extract the array, unnest, filter, project
SELECT (
  SELECT json_extract_string(p, '$.number')
  FROM   unnest(json_extract(payload, '$.Phone')::JSON[]) AS t(p)
  WHERE  json_extract_string(p, '$.type') = 'mobile'
  LIMIT  1
) AS mobile
FROM contacts;

-- JSONata: one expression
SELECT jsonata('Phone[type="mobile"].number', payload) AS mobile
FROM contacts;

For a single field at a fixed path, native is simpler. The break-even is roughly β€œany predicate, any reshape, or more than two output fields.”

Compared to alternatives

  • DuckDB native JSON (->>, json_extract) β€” the right choice for simple field-by-field access. Stays simpler and avoids the extension entirely. Use JSONata when the expression starts to fan out across array predicates or object construction.
  • jq outside DuckDB β€” same general role for JSON, different language. jq is a separate process β€” file in, file out. JSONata-in-DuckDB keeps the transform inside the SQL plan, vectorized, joinable, and EXPLAIN-able alongside the rest of the query.
  • Application-side JSONata libraries (JS, Python, etc.) β€” the right call when the surrounding logic already lives in the app. Bringing JSONata into DuckDB matters when the JSON sits in a column and the rest of the pipeline is SQL β€” no row-format conversion, no result shuttling.

Iterating on expressions

JSONata expressions live inside SQL string literals, which makes them awkward to edit in place. The official JSONata exerciser takes a sample document and gives live results β€” the fastest loop for any non-trivial expression. The evaluator there matches what the extension runs, so once an expression is right in the exerciser it transfers unchanged into the SQL string.

API Reference

Function Documentation

Detailed documentation for each function including signatures, parameters, and examples.

jsonata

Scalar Function
Signature
jsonata(expression: VARCHAR, json_data: JSON, bindings: JSON) β†’ JSON
Parameters (Positional)
Parameter Type Mode Description
expression VARCHAR Positional A JSONata expression. Typically a SQL literal β€” when it's constant the extension parses it once and reuses the compiled program across every row.
json_data JSON Positional The JSON value the expression is evaluated against. Any DuckDB JSON value β€” column, literal, or function result.
bindings JSON Positional Optional. JSON object whose keys become $variable names accessible inside the expression. Use it to parameterize a static expression with per-query values (thresholds, ids, dates).
Returns

Result of evaluating the expression β€” typed as JSON. Cast (::VARCHAR, ::DOUBLE, ::JSON[]) or unnest downstream depending on what the expression returns.

Description

Evaluates a JSONata expression against a JSON document and returns the result as JSON.

JSONata is a lightweight query and transformation language for JSON, modelled on XPath. The expression can be a path traversal, an array predicate filter, an object-construction literal that reshapes the input, or any combination β€” all in one string.

Two arities:

  • jsonata(expression, json_data) β€” the common form.
  • jsonata(expression, json_data, bindings) β€” passes a bindings JSON object whose keys become $variable names inside the expression. Useful when the expression is a SQL constant but the parameters are per-query (e.g. a threshold lifted from a SQL variable).

When expression is a SQL constant β€” the typical case β€” it's parsed once and the compiled program is reused per row. Per-row cost is JSON walk + apply.

The full JSONata language reference, including string functions, numeric functions, aggregation, higher-order functions, sorting and grouping, and programming constructs, applies. The interactive JSONata exerciser is the fastest way to iterate on an expression before pasting it into SQL.

Examples
1

Extract a nested field

SELECT jsonata('Account.Name',
               '{"Account":{"Name":"Firefly"}}');
-- "Firefly"
2

Concatenate fields with the JSONata `&` operator

SELECT jsonata('FirstName & " " & Surname',
               '{"FirstName":"Fred","Surname":"Smith"}');
-- "Fred Smith"
3

Filter an array with a predicate, then project

SELECT jsonata('Phone[type="mobile"].number',
  '{"Phone":[
    {"type":"home","number":"0203 544 1234"},
    {"type":"mobile","number":"077 7700 1234"}
  ]}');
-- "077 7700 1234"
4

Reshape JSON into a new object in one expression

SELECT jsonata('{
  "name":   FirstName & " " & Surname,
  "mobile": Phone[type="mobile"].number
}', payload) AS reshaped
FROM contacts;
5

Aggregate inside a JSON document

SELECT jsonata('$sum(Order.Product.Price)', orders);
6

Parameterize with the 3-arg bindings form

SELECT jsonata('items[price > $threshold].id',
               payload,
               '{"threshold": 100}')
FROM line_items;

Practical Examples

Cookbook

Real-world recipes and patterns for common use cases.

Extract a nested field

The simplest form β€” a dotted path expression replaces a chain of json_extract calls:

SELECT jsonata('Account.Name',
               '{"Account":{"Name":"Firefly"}}');
-- "Firefly"

SELECT jsonata('Order.Product.Description.Colour', payload)
FROM orders;

See jsonata for the function signature.

Concatenate or compute on the way out

JSONata’s & operator concatenates strings; arithmetic operators work on numbers:

SELECT jsonata('FirstName & " " & Surname',
               '{"FirstName":"Fred","Surname":"Smith"}');
-- "Fred Smith"

SELECT jsonata('Quantity * UnitPrice',
               '{"Quantity":3,"UnitPrice":12.50}');
-- 37.5

Filter an array with a predicate

The predicate-filter syntax is JSONata’s headline feature β€” select matching array elements, then project, in one expression:

SELECT jsonata('Phone[type="mobile"].number', payload) AS mobile
FROM contacts;

SELECT jsonata('items[status="open" and amount > 100].id', payload)
FROM invoices;

In native DuckDB JSON, the equivalent typically needs unnest plus a subquery.

Reshape into a new object

Object construction builds a fresh JSON object literal β€” pick fields, rename them, compute new ones β€” inline:

SELECT jsonata('{
  "name":   FirstName & " " & Surname,
  "mobile": Phone[type="mobile"].number,
  "tier":   Account.Tier
}', payload) AS reshaped
FROM contacts;

The result is a JSON object with exactly the fields you wanted β€” useful for building API response shapes or normalizing third-party payloads before storage.

Aggregate inside a JSON document

Use JSONata aggregation functions to roll up arrays-in-JSON without leaving the expression:

SELECT
  order_id,
  jsonata('$sum(Order.Product.(Price * Quantity))', payload)::DOUBLE AS total,
  jsonata('$count(Order.Product)', payload)::INTEGER                 AS line_items
FROM orders;

Map, filter, reduce arrays

Higher-order functions let you transform arrays element-wise:

-- Square every price
SELECT jsonata('$map(prices, function($v) { $v * $v })',
               '{"prices":[1,2,3,4]}');

-- Keep only the high-value rows
SELECT jsonata('$filter(items, function($v) { $v.amount > 100 })', payload)
FROM invoices;

Parameterize an expression at query time

The 3-arg jsonata(expr, json, bindings) form supplies external $variables:

SET VARIABLE threshold = 100;

SELECT jsonata(
  'items[price > $threshold].{ "id": id, "price": price }',
  payload,
  json_object('threshold', getvariable('threshold'))
) AS expensive_items
FROM line_items;

The expression stays a static string (so it parses once); the per-query value rides in via bindings.

Unnest a JSONata array result into rows

When a JSONata expression returns an array, combine it with DuckDB’s unnest:

SELECT contact_id, phone
FROM (
  SELECT
    contact_id,
    unnest(jsonata('Phone.number', payload)::JSON[]) AS phone
  FROM contacts
);

Iterate before pasting

For anything more than a one-line path, prototype in the official JSONata exerciser β€” paste your sample document, iterate on the expression with live results, then bring it into SQL once it’s right. Same evaluator semantics as the extension.

Platform Support

Compatibility

Extension availability may vary by platform and DuckDB version. Check below to ensure this extension supports your environment before installation.

Quick Facts

Software License MIT
Pricing Free
Written In Rust
Source Available Yes
View on GitHub
Usage
13,486+
loads in last 30 days

Platforms

Linux
Linux (musl)
macOS
Windows
WASM
Supported platform architectures
Linux: x86_64, aarch64
Linux (musl): x86_64
macOS: Intel, Apple Silicon
Windows: x86_64
WASM: eh, mvp, threads
Compiled binary sizes
Platform Architecture Size
Linux aarch64 3.17 MB
Linux x86_64 3.58 MB
Linux (musl) x86_64 2.86 MB
macOS Apple Silicon 2.24 MB
macOS Intel 2.57 MB
Windows x86_64 7.88 MB
WASM eh 38.3 KB
WASM mvp 34.0 KB
WASM threads 34.0 KB

Gzipped download size from the DuckDB community-extensions registry.

DuckDB Versions

Release calendar
Supported
v1.4.4 v1.5.2

JSONata expressions inside SQL

Install the JSONata extension to evaluate JSONata expressions against JSON columns β€” for path traversal, conditional reshaping, and array transforms that are awkward in plain SQL.