๐Ÿ—บ๏ธ

GeoSilo

Compact geometry storage for DuckDB. A first-class GEOSILO column type that delta-encodes coordinates as scaled integers and pairs with ZSTD for ~21% smaller geometry storage than DuckDB's GEOMETRY (and ~70% smaller raw blobs against WKB). Native ST_* overloads for bounding-box, area, length, and introspection skip the WKB decode entirely.

Install

-- Install the extension
INSTALL geosilo FROM community;

-- Load it into your session
LOAD geosilo;

-- spatial first, then geosilo (geosilo overlays spatial's ST_* surface)
INSTALL spatial; LOAD spatial;
INSTALL geosilo FROM community; LOAD geosilo;

-- A GEOSILO column with ZSTD โ€” the recommended setup
CREATE TABLE parcels (
  id    INTEGER,
  name  VARCHAR,
  geom  GEOSILO('EPSG:4326') USING COMPRESSION zstd
);

-- Insert from any GEOMETRY source โ€” implicit cast auto-encodes
INSERT INTO parcels SELECT id, name, geom FROM raw_parcels;

-- ST_* runs directly on the GEOSILO column; native overloads skip the decode
SELECT id, ST_Area(geom), ST_X(ST_Centroid(geom)) FROM parcels;

Technical Overview

Why Use GeoSilo?

DuckDB's spatial extension stores geometry as WKB โ€” float64 coordinate pairs at 16 bytes each, even though adjacent vertices in a polygon share nearly all their bits. GeoSilo replaces that with a delta-encoded integer layout and a first-class GEOSILO column type. Most existing SQL keeps working because GeoSilo overlays the same ST_* surface; the introspection / bounding-box / measurement calls get native overloads that skip the WKB decode entirely.

๐Ÿ“ฆ What this extension is for

Compact geometry storage with a transparent ST_* surface. The GEOSILO type is a drop-in replacement for GEOMETRY columns in the cases GeoSilo targets โ€” large static or slowly-changing polygon / point datasets where on-disk size and bounding-box / area / length speed matter more than complex spatial operations.

  • โ€ข Smaller on disk: On the TIGER/Line 2025 US Census dataset, GEOSILO + ZSTD is ~21% smaller than DuckDB's GEOMETRY column (and ~70% smaller as raw blobs vs WKB). The win comes from delta-encoded int16 deltas compressing exceptionally well under ZSTD.
  • โ€ข Native ST_* fast paths: ST_GeometryType, ST_IsEmpty, ST_NPoints, ST_X / ST_Y, ST_XMin / ST_XMax / ST_YMin / ST_YMax, ST_Area, ST_Length, ST_Perimeter all have GEOSILO-native implementations. They read the silo header or walk the integer delta stream directly โ€” no GEOS round-trip.
  • โ€ข Transparent fallback: Any ST_* function without a native GEOSILO overload โ€” ST_Buffer, ST_Union, ST_AsText, etc. โ€” still works. The implicit GEOSILO โ†’ GEOMETRY cast runs geosilo_decode on demand.
  • โ€ข Arrow IPC interchange: GeoSilo registers an Arrow extension type named queryfarm.geosilo. Producers tag a binary column with that metadata; the DuckDB consumer auto-decodes silo blobs to GEOMETRY on read and re-encodes on write. CRS rides through the metadata so scale stays consistent end-to-end.

๐Ÿงฎ How the encoding works

GeoSilo's encoding is straightforward and verifiable from the README. No ML, no learned codec โ€” just delta encoding plus integer scaling.

  • โ€ข Coordinates as scaled integers: Each (x, y) pair is multiplied by an integer scale and stored as integers. For EPSG:4326 (degrees), the default scale is 1e7 โ€” about 1 cm precision. For projected CRSes like UTM or EPSG:3857 (meters), the default is 100, also 1 cm.
  • โ€ข Delta encoding within rings: The first vertex of each ring is stored as a 4-byte int32. Every subsequent vertex is a 2-byte int16 delta from the previous one. Roughly 90% of deltas fit in 2 bytes; the remainder fall back to a 6-byte escape sequence.
  • โ€ข Tight headers, no closing vertex: Points use a 1-byte compact header. Polygon rings omit the closing duplicate vertex (it's reconstructed on decode). The header carries geometry_type, vertex_type, and scale, all readable by geosilo_metadata without touching the body.
  • โ€ข Bounding-box and area on integers: Native ST_XMin/XMax/YMin/YMax walk the int16 delta stream and accumulate; native ST_Area runs the shoelace formula on int64 coordinates rebuilt from the same stream. Both avoid the GEOS dependency surface for these calls.

๐Ÿ›ก๏ธ Honest scope

GeoSilo is a storage and fast-path extension, not a replacement for GEOS-backed spatial operations.

  • โ€ข Native overloads are limited: The native fast path covers introspection, bounding-box accessors, and area / length / perimeter. Operations like ST_Buffer, ST_Union, ST_Intersection, ST_AsText, and most of the wider ST_* surface still auto-decode to GEOMETRY and use spatial's GEOS-backed implementation.
  • โ€ข Lossy at the chosen scale: Coordinates round to the integer scale you pick. The defaults give ~1 cm precision for the common CRSes; finer-grained data needs an explicit geosilo_encode(geom, scale). Round-tripping through GeoSilo and back is not bit-exact with the original float64 WKB.
  • โ€ข ZSTD does the heavy lifting: Without ZSTD, raw GEOSILO blobs are actually larger than DuckDB's columnar GEOMETRY storage. USING COMPRESSION zstd is what makes the storage win real โ€” use it.
  • โ€ข Spatial extension required: GeoSilo overlays spatial's ST_* namespace. Load spatial first, then geosilo, in every connection that needs the GEOSILO type.

๐ŸŽฏ Common Use Cases

Shrink a static polygon dataset

TIGER/Line, OSM extracts, parcels, admin boundaries โ€” geometry that's read often and rewritten rarely. Define the column as GEOSILO USING COMPRESSION zstd and the same SQL keeps working at a fraction of the storage cost.

Native bounding-box prefilter

Filter on ST_XMin / ST_XMax / ST_YMin / ST_YMax before any GEOS-backed operation. The native overloads run on integer deltas โ€” fast enough that the prefilter pays for itself even in single-table scans.

Compact geometry over the wire

Use the queryfarm.geosilo Arrow extension type to ship encoded geometry between services. Smaller payloads, transparent decode at the DuckDB consumer.

Fast measurement scans

Per-row ST_Area / ST_Length / ST_Perimeter on millions of polygons without the WKB decode loop. The shoelace runs on int64; ZSTD keeps the data small enough that I/O isn't the bottleneck either.

Deep Dive

Technical Details

What you can do with one column type

The single most useful pattern: take an existing GEOMETRY column and re-declare it as GEOSILO with ZSTD compression. Every existing ST_* query still works โ€” and the introspection / bounding-box / measurement calls get faster:

INSTALL spatial; LOAD spatial;
INSTALL geosilo FROM community; LOAD geosilo;

-- Compact geometry, transparent ST_* surface
CREATE TABLE parcels (
  id    INTEGER,
  name  VARCHAR,
  geom  GEOSILO('EPSG:4326') USING COMPRESSION zstd
);

INSERT INTO parcels SELECT id, name, geom FROM raw_parcels;

-- Bounding-box prefilter โ€” entirely native, no WKB decode
SELECT id, ST_Area(geom)
FROM parcels
WHERE ST_XMin(geom) > -78
  AND ST_XMax(geom) < -76;

The GEOSILO type delta-encodes coordinates as scaled integers โ€” each subsequent vertex of a ring is a 2-byte int16 offset from the previous one. On the TIGER/Line 2025 US Census dataset, GEOSILO + ZSTD is ~21% smaller than DuckDBโ€™s columnar GEOMETRY storage.

Storage and a few fast paths โ€” not a GEOS replacement

GeoSilo is a compact storage format with a handful of native ST_* overloads. It is not a re-implementation of GEOS. The native fast path covers exactly: ST_GeometryType, ST_IsEmpty, ST_NPoints, ST_X / ST_Y, ST_XMin / ST_XMax / ST_YMin / ST_YMax, ST_Area, ST_Length, ST_Perimeter. Everything else โ€” ST_Buffer, ST_Union, ST_Intersection, ST_AsText, and the rest of the spatial extension surface โ€” works via the implicit GEOSILO โ†’ GEOMETRY cast (which runs geosilo_decode under the hood).

The encoding is also lossy at the chosen integer scale. Defaults give ~1 cm precision; finer data needs an explicit geosilo_encode(geom, scale).

For unbounded GEOS-backed analytics, use spatial directly. For storage-heavy workflows where bounding-box filtering and area / length scans dominate, GeoSilo is the right tool.

Setup

GeoSilo overlays spatialโ€™s ST_* namespace, so load both โ€” spatial first:

INSTALL spatial; LOAD spatial;
INSTALL geosilo FROM community; LOAD geosilo;

The GEOSILO column type

Declare a column as GEOSILO, optionally tagging the CRS. With a CRS the integer scale is auto-detected (degrees โ†’ 1e7, meters โ†’ 100); without one, the default is 1e7:

-- With CRS โ€” scale auto-detected
CREATE TABLE parcels (
  id    INTEGER,
  name  VARCHAR,
  geom  GEOSILO('EPSG:4326') USING COMPRESSION zstd
);

-- Without CRS โ€” default scale 1e7
CREATE TABLE shapes (geom GEOSILO USING COMPRESSION zstd);

USING COMPRESSION zstd is the recommended setting โ€” itโ€™s where the storage win comes from. See CREATE TABLE for the full DDL.

The implicit GEOMETRY โ†’ GEOSILO cast auto-encodes on insert, so you can populate from any GEOMETRY source โ€” Parquet, CSV-then-ST_GeomFromText, another DuckDB table:

INSERT INTO parcels SELECT id, geom FROM raw_data;

Whatโ€™s native, what auto-decodes

Standard spatial functions all work on GEOSILO columns. The ones below have native GEOSILO overloads that read the silo header or walk the integer delta stream โ€” no WKB round-trip:

GroupNative overloads
IntrospectionST_GeometryType, ST_IsEmpty, ST_NPoints, ST_X, ST_Y
Bounding boxST_XMin, ST_XMax, ST_YMin, ST_YMax
MeasurementST_Area, ST_Length, ST_Perimeter

Everything else (ST_Buffer, ST_Union, ST_Intersection, ST_AsText, ST_Centroid, etc.) auto-decodes via the implicit cast and runs through the GEOS-backed spatial implementation.

Encoding format, briefly

The format is verifiable and small enough to summarize:

  1. Coordinates scale to integers (-75.509491 ร— 1e7 = -755094910).
  2. The first vertex of each ring is a 4-byte int32 absolute coordinate.
  3. Subsequent vertices are stored as 2-byte int16 deltas from the previous; ~90% fit. The rest use a 6-byte escape sequence.
  4. Polygon rings omit the closing duplicate vertex โ€” itโ€™s reconstructed on decode.
  5. Points use a 1-byte compact header.

The header (geometry_type, vertex_type, scale) is readable without decoding the body โ€” see geosilo_metadata.

Scale and precision

CRS familyUnitsDefault scalePrecision
EPSG:4326 / NAD83degrees10,000,000~1 cm
UTM / EPSG:3857meters1001 cm

Auto-applied when you declare GEOSILO('EPSG:...'). Override per-call with the two-argument form:

SELECT geosilo_encode(geom, 1000) FROM millimeter_data;

Arrow IPC transport

GeoSilo registers an Arrow extension type named queryfarm.geosilo. A producer tags an Arrow binary field with the right metadata and the DuckDB consumer (with GeoSilo loaded) auto-decodes silo blobs to GEOMETRY on read and re-encodes on write โ€” CRS rides through the fieldโ€™s metadata:

geom_field = pa.field("geom", pa.binary(), metadata={
  b"ARROW:extension:name":     b"queryfarm.geosilo",
  b"ARROW:extension:metadata": b'{"crs":"EPSG:4326"}',
})

Sibling extensions in the geo bucket

  • a5 โ€” pentagonal global geospatial index for spatial aggregation, joins, and equal-area binning. Pairs well: index with a5, store the underlying geometry with geosilo.
  • lindel โ€” Hilbert / Morton space-filling curves for ordering multi-dimensional data on disk. Sort a GEOSILO table by a lindel-encoded centroid to keep nearby geometries on the same Parquet row group / DuckDB block.

Reference

Extension Contents

Quick reference to all available functions and settings organized by category.

Name Description
Encoding

Encode/decode between [GEOMETRY](https://duckdb.org/docs/extensions/spatial) and the compact GEOSILO BLOB format, plus header inspection without a full decode. The GEOSILO column type drives most of this transparently โ€” these functions are for explicit BLOB pipelines and Arrow IPC interchange.

geosilo_decode() Decode a GEOSILO blob back into a standard GEOMETRY
geosilo_encode() Encode a GEOMETRY into the compact GEOSILO format and return a BLOB
geosilo_metadata() Read a GEOSILO blob's header without decoding the geometry
Introspection (native)

ST_* overloads with native GEOSILO implementations โ€” type, emptiness, point count, X/Y, and the four bounding-box accessors. They read the silo header / delta stream directly without round-tripping through WKB, which is where the speed-up on bounding-box prefilters and per-row coordinate access comes from.

ST_GeometryType() Native GEOSILO overload โ€” returns the geometry type (e
ST_IsEmpty() Native GEOSILO overload โ€” TRUE if the encoded geometry has no vertices
ST_NPoints() Native GEOSILO overload โ€” total vertex count, read from the silo header
ST_X() Native GEOSILO overload โ€” X coordinate of a single-point geometry, decoded from the silo header without materializing a GEOMETRY
ST_XMax() Native GEOSILO overload โ€” maximum X across the geometry
ST_XMin() Native GEOSILO overload โ€” minimum X across the geometry, computed by walking the integer delta stream
ST_Y() Native GEOSILO overload โ€” Y coordinate of a single-point geometry, decoded from the silo header without materializing a GEOMETRY
ST_YMax() Native GEOSILO overload โ€” maximum Y across the geometry
ST_YMin() Native GEOSILO overload โ€” minimum Y across the geometry
Measurement (native)

ST_* overloads for area, length, and perimeter computed directly on the integer delta stream (shoelace on int64 for area). No WKB decode, no GEOS dependency surface for these calls.

ST_Area() Native GEOSILO overload โ€” polygon area computed via the shoelace formula directly on int64 coordinates rebuilt from the delta stream
ST_Length() Native GEOSILO overload โ€” total length of linear geometries, computed on the integer delta stream without rehydrating a GEOMETRY
ST_Perimeter() Native GEOSILO overload โ€” perimeter of polygonal geometries, computed on the integer delta stream without rehydrating a GEOMETRY

API Reference

Function Documentation

Detailed documentation for each function including signatures, parameters, and examples.

geosilo_decode

Scalar Function Encoding
Signature
geosilo_decode(col0: BLOB | GEOSILO) โ†’ GEOMETRY
Parameters (Positional)
Parameter Type Mode Description
col0 BLOB | GEOSILO
2 concrete types
BLOBGEOSILO
Positional
Returns
Description

Decode a GEOSILO blob back into a standard GEOMETRY. Most callers don't need to call this directly โ€” the implicit GEOSILO โ†’ GEOMETRY cast runs it on demand for any ST_* function that doesn't have a native GEOSILO overload (e.g. ST_Buffer, ST_Union, ST_AsText).

Examples
1

Round-trip a polygon through the encoding

SELECT ST_AsText(
  geosilo_decode(geosilo_encode(ST_GeomFromText('POLYGON((0 0,10 0,10 10,0 10,0 0))')))
);

geosilo_encode

Scalar Function Encoding
Signature
geosilo_encode(col0: GEOMETRY) โ†’ BLOB
Parameters (Positional)
Parameter Type Mode Description
col0 GEOMETRY Positional
Returns
Description

Encode a GEOMETRY into the compact GEOSILO format and return a BLOB. With one argument, the integer scale is derived from the geometry's CRS โ€” degrees default to 1e7 (โ‰ˆ1 cm), projected meters to 100 (1 cm). With two arguments, the second explicitly sets the scale.

Examples
1

Default scale (auto from CRS, or `1e7` if absent)

SELECT geosilo_encode(ST_GeomFromText('POINT(1 2)'));
2

Explicit scale โ€” UTM / Web Mercator data, 1 cm precision

SELECT geosilo_encode(geom, 100) FROM utm_data;

geosilo_metadata

Scalar Function Encoding
Signature
geosilo_metadata(col0: GEOSILO | BLOB) โ†’ STRUCT(geometry_type VARCHAR, vertex_type VARCHAR, scale BIGINT)
Parameters (Positional)
Parameter Type Mode Description
col0 GEOSILO | BLOB
2 concrete types
BLOBGEOSILO
Positional
Returns
Description

Read a GEOSILO blob's header without decoding the geometry. Returns STRUCT(geometry_type VARCHAR, vertex_type VARCHAR, scale BIGINT). Useful for filtering by geometry type or auditing the integer scale of a stored column before any heavy spatial work.

Examples
1

Inspect the header of stored silo blobs

SELECT geosilo_metadata(silo_blob) AS meta
FROM compact_table
LIMIT 5;
-- {'geometry_type':'POLYGON','vertex_type':'XY','scale':10000000}

ST_Area

Scalar Function Measurement (native)
Signature
ST_Area(col0: GEOSILO) โ†’ DOUBLE
Parameters (Positional)
Parameter Type Mode Description
col0 GEOSILO Positional
Returns
Description

Native GEOSILO overload โ€” polygon area computed via the shoelace formula directly on int64 coordinates rebuilt from the delta stream. Skips the WKB decode entirely.

Examples

ST_GeometryType

Scalar Function Introspection (native)
Signature
ST_GeometryType(col0: GEOSILO) โ†’ VARCHAR
Parameters (Positional)
Parameter Type Mode Description
col0 GEOSILO Positional
Returns
Description

Native GEOSILO overload โ€” returns the geometry type (e.g. POINT, POLYGON) by reading a single byte of the silo header. No WKB decode.

Examples

ST_IsEmpty

Scalar Function Introspection (native)
Signature
ST_IsEmpty(col0: GEOSILO) โ†’ BOOLEAN
Parameters (Positional)
Parameter Type Mode Description
col0 GEOSILO Positional
Returns
Description

Native GEOSILO overload โ€” TRUE if the encoded geometry has no vertices.

Examples

ST_Length

Scalar Function Measurement (native)
Signature
ST_Length(col0: GEOSILO) โ†’ DOUBLE
Parameters (Positional)
Parameter Type Mode Description
col0 GEOSILO Positional
Returns
Description

Native GEOSILO overload โ€” total length of linear geometries, computed on the integer delta stream without rehydrating a GEOMETRY.

Examples

ST_NPoints

Scalar Function Introspection (native)
Signature
ST_NPoints(col0: GEOSILO) โ†’ INTEGER
Parameters (Positional)
Parameter Type Mode Description
col0 GEOSILO Positional
Returns
Description

Native GEOSILO overload โ€” total vertex count, read from the silo header.

Examples

ST_Perimeter

Scalar Function Measurement (native)
Signature
ST_Perimeter(col0: GEOSILO) โ†’ DOUBLE
Parameters (Positional)
Parameter Type Mode Description
col0 GEOSILO Positional
Returns
Description

Native GEOSILO overload โ€” perimeter of polygonal geometries, computed on the integer delta stream without rehydrating a GEOMETRY.

Examples

ST_X

Scalar Function Introspection (native)
Signature
ST_X(col0: GEOSILO) โ†’ DOUBLE
Parameters (Positional)
Parameter Type Mode Description
col0 GEOSILO Positional
Returns
Description

Native GEOSILO overload โ€” X coordinate of a single-point geometry, decoded from the silo header without materializing a GEOMETRY.

Examples

ST_XMax

Scalar Function Introspection (native)
Signature
ST_XMax(col0: GEOSILO) โ†’ DOUBLE
Parameters (Positional)
Parameter Type Mode Description
col0 GEOSILO Positional
Returns
Description

Native GEOSILO overload โ€” maximum X across the geometry. Walks the integer delta stream; no WKB decode.

Examples

ST_XMin

Scalar Function Introspection (native)
Signature
ST_XMin(col0: GEOSILO) โ†’ DOUBLE
Parameters (Positional)
Parameter Type Mode Description
col0 GEOSILO Positional
Returns
Description

Native GEOSILO overload โ€” minimum X across the geometry, computed by walking the integer delta stream. Pair with ST_XMax / ST_YMin / ST_YMax for native bounding-box prefilters.

Examples

ST_Y

Scalar Function Introspection (native)
Signature
ST_Y(col0: GEOSILO) โ†’ DOUBLE
Parameters (Positional)
Parameter Type Mode Description
col0 GEOSILO Positional
Returns
Description

Native GEOSILO overload โ€” Y coordinate of a single-point geometry, decoded from the silo header without materializing a GEOMETRY.

Examples

ST_YMax

Scalar Function Introspection (native)
Signature
ST_YMax(col0: GEOSILO) โ†’ DOUBLE
Parameters (Positional)
Parameter Type Mode Description
col0 GEOSILO Positional
Returns
Description

Native GEOSILO overload โ€” maximum Y across the geometry. Walks the integer delta stream; no WKB decode.

Examples

ST_YMin

Scalar Function Introspection (native)
Signature
ST_YMin(col0: GEOSILO) โ†’ DOUBLE
Parameters (Positional)
Parameter Type Mode Description
col0 GEOSILO Positional
Returns
Description

Native GEOSILO overload โ€” minimum Y across the geometry. Walks the integer delta stream; no WKB decode.

Examples

Practical Examples

Cookbook

Real-world recipes and patterns for common use cases.

Setup

Load spatial first, then geosilo:

INSTALL spatial; LOAD spatial;
INSTALL geosilo FROM community; LOAD geosilo;

The GEOSILO column type and the native ST_* overloads only register once both are loaded โ€” geosilo overlays spatialโ€™s ST_* namespace.

Define a compact geometry table

The recommended shape โ€” explicit CRS, ZSTD compression:

CREATE TABLE parcels (
  id    INTEGER,
  name  VARCHAR,
  geom  GEOSILO('EPSG:4326') USING COMPRESSION zstd
);

The CRS lets GeoSilo pick a sensible integer scale automatically (1e7 for degrees, 100 for projected meters). USING COMPRESSION zstd is where the storage win comes from โ€” without it, raw GEOSILO blobs are larger than DuckDBโ€™s columnar GEOMETRY storage. See CREATE TABLE for the DDL surface.

Load from any GEOMETRY source

The implicit GEOMETRY โ†’ GEOSILO cast auto-encodes on insert:

-- From another DuckDB table
INSERT INTO parcels
SELECT id, name, geom FROM raw_parcels;

-- From WKT in CSV
INSERT INTO parcels
SELECT id, name, ST_GeomFromText(wkt) FROM read_csv('parcels.csv');

-- From Parquet with a WKB column
INSERT INTO parcels
SELECT id, name, ST_GeomFromWKB(geom_wkb)
FROM read_parquet('parcels.parquet');

Sources can be any expression returning GEOMETRY โ€” see spatialโ€™s constructor functions for the full list.

Native bounding-box prefilter

The four ST_XMin / ST_XMax / ST_YMin / ST_YMax calls have GEOSILO-native implementations โ€” they walk the integer delta stream without rebuilding a GEOMETRY. Use them in WHERE clauses to prefilter before any GEOS-backed work:

-- Everything in a longitude band โ€” entirely native
SELECT id, name
FROM parcels
WHERE ST_XMin(geom) > -78
  AND ST_XMax(geom) < -76;

-- Hand off to GEOS only for the survivors of the prefilter
SELECT id, ST_AsText(ST_Centroid(geom))
FROM parcels
WHERE ST_XMin(geom) > -78
  AND ST_XMax(geom) < -76
  AND ST_YMin(geom) >  39
  AND ST_YMax(geom) <  41;

See ST_XMin / ST_XMax / ST_YMin / ST_YMax.

Native area, length, perimeter

ST_Area on a GEOSILO column runs the shoelace formula on int64 coordinates rebuilt from the delta stream โ€” no WKB decode:

-- Per-feature area
SELECT id, ST_Area(geom) AS area
FROM parcels
ORDER BY area DESC
LIMIT 10;

-- Total length of a road network
SELECT SUM(ST_Length(geom)) AS total_road_length
FROM roads;

-- Perimeter histogram
SELECT FLOOR(LOG(ST_Perimeter(geom))) AS bucket, COUNT(*) AS n
FROM parcels
GROUP BY bucket
ORDER BY bucket;

See ST_Area, ST_Length, ST_Perimeter.

Per-point coordinate access

ST_X / ST_Y on point geometries decode straight from the silo header:

SELECT id,
       ST_X(geom) AS lon,
       ST_Y(geom) AS lat
FROM stations;

For non-point geometries, use ST_Centroid (which auto-decodes to GEOMETRY) and then ST_X / ST_Y on the resulting point.

Filter by geometry type without decoding

geosilo_metadata reads the silo header directly. Combine with ST_GeometryType (which is also native) to filter cheaply by type:

-- Counts by geometry type โ€” no WKB decode per row
SELECT ST_GeometryType(geom) AS gtype, COUNT(*) AS n
FROM mixed_geom_table
GROUP BY gtype
ORDER BY n DESC;

-- Inspect the integer scale of stored blobs
SELECT geosilo_metadata(silo_blob) AS meta
FROM compact_table
LIMIT 5;
-- {'geometry_type':'POLYGON','vertex_type':'XY','scale':10000000}

Manual encode / decode for BLOB pipelines

When you control the BLOB pipeline directly โ€” exporting to a file format, shipping over a non-Arrow protocol, staging โ€” use geosilo_encode and geosilo_decode:

-- Encode for export
COPY (
  SELECT id, geosilo_encode(geom) AS silo
  FROM parcels_geometry
) TO 'parcels_silo.parquet' (FORMAT 'parquet');

-- Decode on read
SELECT id, ST_AsText(geosilo_decode(silo)) AS wkt
FROM read_parquet('parcels_silo.parquet')
LIMIT 5;

For a custom integer scale (e.g. millimeter-precision projected data), pass the second argument:

SELECT geosilo_encode(geom, 1000) FROM mm_precision_data;

Mix native fast paths with GEOS-backed operations

Anything outside the native overload set auto-decodes through the implicit GEOSILO โ†’ GEOMETRY cast. You donโ€™t need to think about it โ€” but if youโ€™re scanning many rows, push the native operations first to shrink the candidate set:

-- ST_Buffer auto-decodes; ST_XMin/XMax/YMin/YMax run native first
SELECT id, ST_AsText(ST_Buffer(geom, 0.001)) AS buffered_wkt
FROM parcels
WHERE ST_XMin(geom) > -78
  AND ST_XMax(geom) < -76
LIMIT 100;

See the spatial extension for the GEOS-backed surface that auto-decodes.

Pair with sibling geo extensions

GeoSilo handles compact storage; pair it with the other geo bucket extensions for indexing and ordering:

-- a5: pentagonal geospatial index for aggregation / joins
INSTALL a5 FROM community; LOAD a5;

-- Bin parcels by A5 cell at resolution 12
SELECT a5_lonlat_to_cell(ST_X(ST_Centroid(geom)),
                         ST_Y(ST_Centroid(geom)),
                         12) AS cell,
       COUNT(*) AS parcels_in_cell
FROM parcels
GROUP BY cell
ORDER BY parcels_in_cell DESC;

See a5 for the indexing surface, and lindel for space-filling curve ordering โ€” sort a GEOSILO table by a Hilbert-encoded centroid to keep nearby geometries on the same DuckDB block / Parquet row group.

Platform Support

Compatibility

Extension availability may vary by platform and DuckDB version. Check below to ensure this extension supports your environment before installation.

Quick Facts

Software License MIT
Pricing Free
Written In C++
Source Available Yes
View on GitHub
Usage
5,214+
loads in last 30 days

Platforms

Linux
Linux (musl)
macOS
Windows
WASM
Supported platform architectures
Linux: aarch64, x86_64
Linux (musl): Not available
macOS: Apple Silicon, Intel
Windows: x86_64
WASM: threads, eh, mvp
Compiled binary sizes
Platform Architecture Size
Linux aarch64 2.70 MB
Linux x86_64 3.07 MB
macOS Apple Silicon 2.31 MB
macOS Intel 2.65 MB
Windows x86_64 7.53 MB
WASM eh 28.3 KB
WASM mvp 26.7 KB
WASM threads 26.7 KB

Gzipped download size from the DuckDB community-extensions registry.

DuckDB Versions

Release calendar
Supported
v1.5.2

Compact geometry, native fast paths

Install GeoSilo for smaller-on-disk geometry storage and faster ST_* bounding-box / area / length operations โ€” overlay it on the spatial extension and most existing SQL keeps working.