GeoSilo
Compact geometry storage for DuckDB. A first-class GEOSILO column type that delta-encodes coordinates as scaled integers and pairs with ZSTD for ~21% smaller geometry storage than DuckDB's GEOMETRY (and ~70% smaller raw blobs against WKB). Native ST_* overloads for bounding-box, area, length, and introspection skip the WKB decode entirely.
Install
-- Install the extension
INSTALL geosilo FROM community;
-- Load it into your session
LOAD geosilo;
-- spatial first, then geosilo (geosilo overlays spatial's ST_* surface)
INSTALL spatial; LOAD spatial;
INSTALL geosilo FROM community; LOAD geosilo;
-- A GEOSILO column with ZSTD: the recommended setup
CREATE TABLE parcels (
id INTEGER,
name VARCHAR,
geom GEOSILO('EPSG:4326') USING COMPRESSION zstd
);
-- Insert from any GEOMETRY source; the implicit cast auto-encodes
INSERT INTO parcels SELECT id, name, geom FROM raw_parcels;
-- ST_* runs directly on the GEOSILO column; native overloads skip the decode
SELECT id, ST_Area(geom), ST_X(ST_Centroid(geom)) FROM parcels;
Technical Overview
Why Use GeoSilo?
DuckDB's spatial extension stores geometry as WKB: float64 coordinate pairs at 16 bytes each, even though adjacent vertices in a polygon share nearly all their bits. GeoSilo replaces that with a delta-encoded integer layout and a first-class GEOSILO column type. Most existing SQL keeps working because GeoSilo overlays the same ST_* surface; the introspection / bounding-box / measurement calls get native overloads that skip the WKB decode entirely.
What this extension is for
Compact geometry storage with a transparent ST_* surface. The GEOSILO type is a drop-in replacement for GEOMETRY columns in the cases GeoSilo targets: large static or slowly-changing polygon / point datasets where on-disk size and bounding-box / area / length speed matter more than complex spatial operations.
- Smaller on disk: On the TIGER/Line 2025 US Census dataset, GEOSILO + ZSTD is ~21% smaller than DuckDB's GEOMETRY column (and ~70% smaller as raw blobs vs WKB). The win comes from int16 deltas compressing exceptionally well under ZSTD.
- Native ST_* fast paths: ST_GeometryType, ST_IsEmpty, ST_NPoints, ST_X / ST_Y, ST_XMin / ST_XMax / ST_YMin / ST_YMax, ST_Area, ST_Length, ST_Perimeter all have GEOSILO-native implementations. They read the silo header or walk the integer delta stream directly, with no GEOS round-trip.
- Transparent fallback: Any ST_* function without a native GEOSILO overload (ST_Buffer, ST_Union, ST_AsText, etc.) still works. The implicit GEOSILO → GEOMETRY cast runs geosilo_decode on demand.
- Arrow IPC interchange: GeoSilo registers an Arrow extension type named queryfarm.geosilo. Producers tag a binary column with that metadata; the DuckDB consumer auto-decodes silo blobs to GEOMETRY on read and re-encodes on write. CRS rides through the metadata so scale stays consistent end-to-end.
How the encoding works
GeoSilo's encoding is straightforward and verifiable from the README. No ML, no learned codec, just delta encoding plus integer scaling.
- Coordinates as scaled integers: Each (x, y) pair is multiplied by an integer scale and stored as integers. For EPSG:4326 (degrees), the default scale is 1e7, about 1 cm precision. For projected CRSes like UTM or EPSG:3857 (meters), the default is 100, also 1 cm.
- Delta encoding within rings: The first vertex of each ring is stored as a 4-byte int32. Every subsequent vertex is a 2-byte int16 delta from the previous one. Roughly 90% of deltas fit in 2 bytes; the remainder fall back to a 6-byte escape sequence.
- Tight headers, no closing vertex: Points use a 1-byte compact header. Polygon rings omit the closing duplicate vertex (it's reconstructed on decode). The header carries geometry_type, vertex_type, and scale, all readable by geosilo_metadata without touching the body.
- Bounding-box and area on integers: Native ST_XMin / XMax / YMin / YMax walk the int16 delta stream and accumulate; native ST_Area runs the shoelace formula on int64 coordinates rebuilt from the same stream. Both avoid the GEOS dependency surface for these calls.
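The scaling and delta steps in the list above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not GeoSilo's actual byte layout: `encode_ring` and `decode_ring` are hypothetical helper names, deltas are kept as plain Python integers per axis rather than packed int16 values, and the escape sequence is not modeled.

```python
from typing import List, Tuple

Ring = List[Tuple[float, float]]
IntPoint = Tuple[int, int]


def encode_ring(ring: Ring, scale: int = 10_000_000) -> Tuple[IntPoint, List[IntPoint]]:
    """Quantize a ring and return (first_vertex, deltas), dropping the closing vertex."""
    if not ring:
        raise ValueError("ring must contain at least one vertex")
    # Scale to integers; round() snaps each coordinate to the nearest grid point.
    pts = [(round(x * scale), round(y * scale)) for x, y in ring]
    # A closed ring repeats its first vertex at the end; that copy is not stored.
    if len(pts) > 1 and pts[0] == pts[-1]:
        pts = pts[:-1]
    # Each later vertex is stored as the difference from its predecessor.
    deltas = [(b[0] - a[0], b[1] - a[1]) for a, b in zip(pts, pts[1:])]
    return pts[0], deltas


def decode_ring(first: IntPoint, deltas: List[IntPoint], scale: int = 10_000_000) -> Ring:
    """Rebuild float coordinates by accumulating deltas, then re-close the ring."""
    x, y = first
    pts = [(x, y)]
    for dx, dy in deltas:
        x += dx
        y += dy
        pts.append((x, y))
    pts.append(pts[0])  # reconstruct the closing duplicate vertex
    return [(px / scale, py / scale) for px, py in pts]


# A small square near the longitude used elsewhere in this page (hypothetical data).
square = [
    (-75.509491, 39.0),
    (-75.509391, 39.0),
    (-75.509391, 39.0001),
    (-75.509491, 39.0001),
    (-75.509491, 39.0),
]
first_vertex, ring_deltas = encode_ring(square)
restored = decode_ring(first_vertex, ring_deltas)
```

The deltas for this square are three small integers per axis, which is the property that makes the stream compress well; the decoded coordinates agree with the input to within half a grid step.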
Honest scope
GeoSilo is a storage and fast-path extension, not a replacement for GEOS-backed spatial operations.
- Native overloads are limited: The native fast path covers introspection, bounding-box accessors, and area / length / perimeter. Operations like ST_Buffer, ST_Union, ST_Intersection, ST_AsText, and most of the wider ST_* surface still auto-decode to GEOMETRY and use spatial's GEOS-backed implementation.
- Lossy at the chosen scale: Coordinates round to the integer scale you pick. The defaults give ~1 cm precision for the common CRSes; finer-grained data needs an explicit geosilo_encode(geom, scale). Round-tripping through GeoSilo and back is not bit-exact with the original float64 WKB.
- ZSTD does the heavy lifting: Without ZSTD, raw GEOSILO blobs are actually larger than DuckDB's columnar GEOMETRY storage. USING COMPRESSION zstd is what makes the storage win real, so use it.
- Spatial extension required: GeoSilo overlays spatial's ST_* namespace. Load spatial first, then geosilo, in every connection that needs the GEOSILO type.
Common Use Cases
Shrink a static polygon dataset
TIGER/Line, OSM extracts, parcels, admin boundaries: geometry that's read often and rewritten rarely. Define the column as GEOSILO USING COMPRESSION zstd and the same SQL keeps working at a fraction of the storage cost.
Native bounding-box prefilter
Filter on ST_XMin / ST_XMax / ST_YMin / ST_YMax before any GEOS-backed operation. The native overloads run on integer deltas, fast enough that the prefilter pays for itself even in single-table scans.
Compact geometry over the wire
Use the queryfarm.geosilo Arrow extension type to ship encoded geometry between services. Smaller payloads, transparent decode at the DuckDB consumer.
Fast measurement scans
Per-row ST_Area / ST_Length / ST_Perimeter on millions of polygons without the WKB decode loop. The shoelace runs on int64; ZSTD keeps the data small enough that I/O isn't the bottleneck either.
Deep Dive
Technical Details
What you can do with one column type
The single most useful pattern: take an existing GEOMETRY column and re-declare it as GEOSILO with ZSTD compression. Every existing ST_* query still works, and the introspection / bounding-box / measurement calls get faster:
INSTALL spatial; LOAD spatial;
INSTALL geosilo FROM community; LOAD geosilo;
-- Compact geometry, transparent ST_* surface
CREATE TABLE parcels (
id INTEGER,
name VARCHAR,
geom GEOSILO('EPSG:4326') USING COMPRESSION zstd
);
INSERT INTO parcels SELECT id, name, geom FROM raw_parcels;
-- Bounding-box prefilter: entirely native, no WKB decode
SELECT id, ST_Area(geom)
FROM parcels
WHERE ST_XMin(geom) > -78
AND ST_XMax(geom) < -76;
The GEOSILO type delta-encodes coordinates as scaled integers: each subsequent vertex of a ring is a 2-byte int16 offset from the previous one. On the TIGER/Line 2025 US Census dataset, GEOSILO + ZSTD is ~21% smaller than DuckDB's columnar GEOMETRY storage.
GeoSilo is a compact storage format with a handful of native ST_* overloads. It is not a re-implementation of GEOS. The native fast path covers exactly: ST_GeometryType, ST_IsEmpty, ST_NPoints, ST_X / ST_Y, ST_XMin / ST_XMax / ST_YMin / ST_YMax, ST_Area, ST_Length, ST_Perimeter. Everything else (ST_Buffer, ST_Union, ST_Intersection, ST_AsText, and the rest of the spatial extension surface) works via the implicit GEOSILO → GEOMETRY cast (which runs geosilo_decode under the hood).
The encoding is also lossy at the chosen integer scale. Defaults give ~1 cm precision; finer data needs an explicit geosilo_encode(geom, scale).
For unbounded GEOS-backed analytics, use spatial directly. For storage-heavy workflows where bounding-box filtering and area / length scans dominate, GeoSilo is the right tool.
Setup
GeoSilo overlays spatial's ST_* namespace, so load both, spatial first:
INSTALL spatial; LOAD spatial;
INSTALL geosilo FROM community; LOAD geosilo;
The GEOSILO column type
Declare a column as GEOSILO, optionally tagging the CRS. With a CRS the integer scale is auto-detected (degrees → 1e7, meters → 100); without one, the default is 1e7:
-- With CRS: scale auto-detected
CREATE TABLE parcels (
id INTEGER,
name VARCHAR,
geom GEOSILO('EPSG:4326') USING COMPRESSION zstd
);
-- Without CRS: default scale 1e7
CREATE TABLE shapes (geom GEOSILO USING COMPRESSION zstd);
USING COMPRESSION zstd is the recommended setting; it's where the storage win comes from. See CREATE TABLE for the full DDL.
The implicit GEOMETRY → GEOSILO cast auto-encodes on insert, so you can populate from any GEOMETRY source (Parquet, CSV-then-ST_GeomFromText, another DuckDB table):
INSERT INTO parcels (id, geom) SELECT id, geom FROM raw_data;
What's native, what auto-decodes
Standard spatial functions all work on GEOSILO columns. The ones below have native GEOSILO overloads that read the silo header or walk the integer delta stream, with no WKB round-trip:
| Group | Native overloads |
|---|---|
| Introspection | ST_GeometryType, ST_IsEmpty, ST_NPoints, ST_X, ST_Y |
| Bounding box | ST_XMin, ST_XMax, ST_YMin, ST_YMax |
| Measurement | ST_Area, ST_Length, ST_Perimeter |
Everything else (ST_Buffer, ST_Union, ST_Intersection, ST_AsText, ST_Centroid, etc.) auto-decodes via the implicit cast and runs through the GEOS-backed spatial implementation.
Encoding format, briefly
The format is verifiable and small enough to summarize:
- Coordinates scale to integers (-75.509491 × 1e7 = -755094910).
- The first vertex of each ring is a 4-byte int32 absolute coordinate.
- Subsequent vertices are stored as 2-byte int16 deltas from the previous; ~90% fit. The rest use a 6-byte escape sequence.
- Polygon rings omit the closing duplicate vertex; it's reconstructed on decode.
- Points use a 1-byte compact header.
The header (geometry_type, vertex_type, scale) is readable without decoding the body; see geosilo_metadata.
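Whether a delta fits in 2 bytes depends on the scale and on how far apart adjacent vertices are. The sketch below works through that arithmetic; it assumes deltas are taken per coordinate axis, and `max_step` and `fraction_fitting` are hypothetical helper names rather than GeoSilo functions.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def max_step(scale: int) -> float:
    """Largest positive coordinate step (in CRS units) that still fits an int16 delta."""
    return INT16_MAX / scale


def fraction_fitting(values, scale: int) -> float:
    """Share of consecutive differences along one axis that fit in an int16 after scaling."""
    quantized = [round(v * scale) for v in values]
    deltas = [b - a for a, b in zip(quantized, quantized[1:])]
    if not deltas:
        return 1.0
    fitting = sum(INT16_MIN <= d <= INT16_MAX for d in deltas)
    return fitting / len(deltas)


# Degrees at scale 1e7: steps up to about 0.0033 degrees fit in an int16.
degree_limit = max_step(10_000_000)
# Meters at scale 100: steps up to about 328 m fit in an int16.
meter_limit = max_step(100)
# Two steps along one axis: a 0.0001 degree step fits, a roughly 0.11 degree jump does not.
share = fraction_fitting([-75.509491, -75.509391, -75.4], 10_000_000)
```

Densely digitized boundaries stay well inside these limits, while long straight edges are the ones that would need the wider escape form.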
Scale and precision
| CRS family | Units | Default scale | Precision |
|---|---|---|---|
| EPSG:4326 / NAD83 | degrees | 10,000,000 | ~1 cm |
| UTM / EPSG:3857 | meters | 100 | 1 cm |
Auto-applied when you declare GEOSILO('EPSG:...'). Override per-call with the two-argument form:
SELECT geosilo_encode(geom, 1000) FROM millimeter_data;
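The precision column follows from the scale alone: the grid step is 1/scale in CRS units, and rounding moves a coordinate by at most half a step. A minimal sketch of that arithmetic, assuming roughly 111,320 m per degree (the approximate length of one degree at the equator); `grid_step_meters` and `quantize` are hypothetical names used only for illustration.

```python
METERS_PER_DEGREE = 111_320.0  # assumption: approximate length of one degree at the equator


def grid_step_meters(scale: int, units: str) -> float:
    """Spacing between representable coordinates, expressed in meters."""
    step = 1.0 / scale
    if units == "degrees":
        return step * METERS_PER_DEGREE
    if units == "meters":
        return step
    raise ValueError(f"unsupported units: {units}")


def quantize(value: float, scale: int) -> float:
    """Round a coordinate to the integer grid and convert it back to CRS units."""
    return round(value * scale) / scale


step_4326 = grid_step_meters(10_000_000, "degrees")  # about 0.011 m
step_utm = grid_step_meters(100, "meters")           # 0.01 m
step_mm = grid_step_meters(1000, "meters")           # 0.001 m
snapped = quantize(12.3456789, 100)                  # lands on the 1 cm grid
```

For degree-based CRSes the step in meters along longitude shrinks with the cosine of the latitude, so the equatorial figure is the coarsest case.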
Arrow IPC transport
GeoSilo registers an Arrow extension type named queryfarm.geosilo. A producer tags an Arrow binary field with the right metadata and the DuckDB consumer (with GeoSilo loaded) auto-decodes silo blobs to GEOMETRY on read and re-encodes on write. CRS rides through the field's metadata:
import pyarrow as pa

# Tag a binary field as a GeoSilo column; the CRS travels in the extension metadata
geom_field = pa.field("geom", pa.binary(), metadata={
b"ARROW:extension:name": b"queryfarm.geosilo",
b"ARROW:extension:metadata": b'{"crs":"EPSG:4326"}',
})
Sibling extensions in the geo bucket
- a5: pentagonal global geospatial index for spatial aggregation, joins, and equal-area binning. Pairs well: index with a5, store the underlying geometry with geosilo.
- lindel: Hilbert / Morton space-filling curves for ordering multi-dimensional data on disk. Sort a GEOSILO table by a lindel-encoded centroid to keep nearby geometries on the same Parquet row group / DuckDB block.
Reference
Extension Contents
Quick reference to all available functions and settings organized by category.
| Name | Description | |
|---|---|---|
| Encoding Encode/decode between [ | ||
| geosilo_decode() | Decode a GEOSILO blob back into a standard GEOMETRY | |
| geosilo_encode() | Encode a GEOMETRY into the compact GEOSILO format and return a BLOB | |
| geosilo_metadata() | Read a GEOSILO blob's header without decoding the geometry | |
| Introspection (native) ST_* overloads with native GEOSILO implementations: type, emptiness, point count, X/Y, and the four bounding-box accessors. They read the silo header / delta stream directly without round-tripping through WKB, which is where the speed-up on bounding-box prefilters and per-row coordinate access comes from. | ||
| ST_GeometryType() | Native GEOSILO overload: returns the geometry type (e.g. POINT, POLYGON) | |
| ST_IsEmpty() | Native GEOSILO overload: TRUE if the encoded geometry has no vertices | |
| ST_NPoints() | Native GEOSILO overload: total vertex count, read from the silo header | |
| ST_X() | Native GEOSILO overload: X coordinate of a single-point geometry, decoded from the silo header without materializing a GEOMETRY | |
| ST_XMax() | Native GEOSILO overload: maximum X across the geometry | |
| ST_XMin() | Native GEOSILO overload: minimum X across the geometry, computed by walking the integer delta stream | |
| ST_Y() | Native GEOSILO overload: Y coordinate of a single-point geometry, decoded from the silo header without materializing a GEOMETRY | |
| ST_YMax() | Native GEOSILO overload: maximum Y across the geometry | |
| ST_YMin() | Native GEOSILO overload: minimum Y across the geometry | |
| Measurement (native) ST_* overloads for area, length, and perimeter computed directly on the integer delta stream (shoelace on int64 for area). No WKB decode, no GEOS dependency surface for these calls. | ||
| ST_Area() | Native GEOSILO overload: polygon area computed via the shoelace formula directly on int64 coordinates rebuilt from the delta stream | |
| ST_Length() | Native GEOSILO overload: total length of linear geometries, computed on the integer delta stream without rehydrating a GEOMETRY | |
| ST_Perimeter() | Native GEOSILO overload: perimeter of polygonal geometries, computed on the integer delta stream without rehydrating a GEOMETRY | |
API Reference
Function Documentation
Detailed documentation for each function including signatures, parameters, and examples.
geosilo_decode
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
| col0 | BLOB or GEOSILO | Positional | |
Returns
Description
Decode a GEOSILO blob back into a standard GEOMETRY. Most callers don't need to call this directly: the implicit GEOSILO → GEOMETRY cast runs it on demand for any ST_* function that doesn't have a native GEOSILO overload (e.g. ST_Buffer, ST_Union, ST_AsText).
Examples
Round-trip a polygon through the encoding
SELECT ST_AsText(
geosilo_decode(geosilo_encode(ST_GeomFromText('POLYGON((0 0,10 0,10 10,0 10,0 0))')))
);
Related Functions
geosilo_encode
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
| col0 | GEOMETRY | Positional | |
Returns
Description
Encode a GEOMETRY into the compact GEOSILO format and return a BLOB. With one argument, the integer scale is derived from the geometry's CRS: degrees default to 1e7 (≈1 cm), projected meters to 100 (1 cm). With two arguments, the second explicitly sets the scale.
Examples
Default scale (auto from CRS, or `1e7` if absent)
SELECT geosilo_encode(ST_GeomFromText('POINT(1 2)'));
Explicit scale: UTM / Web Mercator data, 1 cm precision
SELECT geosilo_encode(geom, 100) FROM utm_data;
Related Functions
geosilo_metadata
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
| col0 | GEOSILO or BLOB | Positional | |
Returns
Description
Read a GEOSILO blob's header without decoding the geometry. Returns STRUCT(geometry_type VARCHAR, vertex_type VARCHAR, scale BIGINT). Useful for filtering by geometry type or auditing the integer scale of a stored column before any heavy spatial work.
Examples
Inspect the header of stored silo blobs
SELECT geosilo_metadata(silo_blob) AS meta
FROM compact_table
LIMIT 5;
-- {'geometry_type':'POLYGON','vertex_type':'XY','scale':10000000}
Related Functions
ST_Area
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
| col0 | GEOSILO | Positional | |
Returns
Description
Native GEOSILO overload: polygon area computed via the shoelace formula directly on int64 coordinates rebuilt from the delta stream. Skips the WKB decode entirely.
Examples
Related Functions
- ST_Perimeter() - Native GEOSILO overload: perimeter of polygonal geometries, computed on the integer delta stream without rehydrating a `GEOMETRY`
- ST_Length() - Native GEOSILO overload: total length of linear geometries, computed on the integer delta stream without rehydrating a `GEOMETRY`
ST_GeometryType
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
| col0 | GEOSILO | Positional | |
Returns
Description
Native GEOSILO overload: returns the geometry type (e.g. POINT, POLYGON) by reading a single byte of the silo header. No WKB decode.
Examples
Related Functions
ST_IsEmpty
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
| col0 | GEOSILO | Positional | |
Returns
Description
Native GEOSILO overload: TRUE if the encoded geometry has no vertices.
Examples
Related Functions
ST_Length
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
| col0 | GEOSILO | Positional | |
Returns
Description
Native GEOSILO overload: total length of linear geometries, computed on the integer delta stream without rehydrating a GEOMETRY.
Examples
Related Functions
- ST_Perimeter() - Native GEOSILO overload: perimeter of polygonal geometries, computed on the integer delta stream without rehydrating a `GEOMETRY`
- ST_Area() - Native GEOSILO overload: polygon area computed via the shoelace formula directly on int64 coordinates rebuilt from the delta stream
ST_NPoints
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
| col0 | GEOSILO | Positional | |
Returns
Description
Native GEOSILO overload: total vertex count, read from the silo header.
Examples
Related Functions
ST_Perimeter
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
| col0 | GEOSILO | Positional | |
Returns
Description
Native GEOSILO overload: perimeter of polygonal geometries, computed on the integer delta stream without rehydrating a GEOMETRY.
Examples
Related Functions
ST_X
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
| col0 | GEOSILO | Positional | |
Returns
Description
Native GEOSILO overload: X coordinate of a single-point geometry, decoded from the silo header without materializing a GEOMETRY.
Examples
Related Functions
- ST_Y() - Native GEOSILO overload: Y coordinate of a single-point geometry, decoded from the silo header without materializing a `GEOMETRY`
- ST_XMin() - Native GEOSILO overload: minimum X across the geometry, computed by walking the integer delta stream
- ST_XMax() - Native GEOSILO overload: maximum X across the geometry
ST_XMax
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
| col0 | GEOSILO | Positional | |
Returns
Description
Native GEOSILO overload: maximum X across the geometry. Walks the integer delta stream; no WKB decode.
Examples
Related Functions
ST_XMin
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
| col0 | GEOSILO | Positional | |
Returns
Description
Native GEOSILO overload: minimum X across the geometry, computed by walking the integer delta stream. Pair with ST_XMax / ST_YMin / ST_YMax for native bounding-box prefilters.
Examples
Related Functions
ST_Y
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
| col0 | GEOSILO | Positional | |
Returns
Description
Native GEOSILO overload: Y coordinate of a single-point geometry, decoded from the silo header without materializing a GEOMETRY.
Examples
Related Functions
ST_YMax
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
| col0 | GEOSILO | Positional | |
Returns
Description
Native GEOSILO overload: maximum Y across the geometry. Walks the integer delta stream; no WKB decode.
Examples
Related Functions
ST_YMin
Signature
Parameters (Positional)
| Parameter | Type | Mode | Description |
|---|---|---|---|
| col0 | GEOSILO | Positional | |
Returns
Description
Native GEOSILO overload: minimum Y across the geometry. Walks the integer delta stream; no WKB decode.
Examples
Related Functions
Practical Examples
Cookbook
Real-world recipes and patterns for common use cases.
Setup
Load spatial first, then geosilo:
INSTALL spatial; LOAD spatial;
INSTALL geosilo FROM community; LOAD geosilo;
The GEOSILO column type and the native ST_* overloads only register once both are loaded, because geosilo overlays spatial's ST_* namespace.
Define a compact geometry table
The recommended shape, with explicit CRS and ZSTD compression:
CREATE TABLE parcels (
id INTEGER,
name VARCHAR,
geom GEOSILO('EPSG:4326') USING COMPRESSION zstd
);
The CRS lets GeoSilo pick a sensible integer scale automatically (1e7 for degrees, 100 for projected meters). USING COMPRESSION zstd is where the storage win comes from; without it, raw GEOSILO blobs are larger than DuckDB's columnar GEOMETRY storage. See CREATE TABLE for the DDL surface.
Load from any GEOMETRY source
The implicit GEOMETRY → GEOSILO cast auto-encodes on insert:
-- From another DuckDB table
INSERT INTO parcels
SELECT id, name, geom FROM raw_parcels;
-- From WKT in CSV
INSERT INTO parcels
SELECT id, name, ST_GeomFromText(wkt) FROM read_csv('parcels.csv');
-- From Parquet with a WKB column
INSERT INTO parcels
SELECT id, name, ST_GeomFromWKB(geom_wkb)
FROM read_parquet('parcels.parquet');
Sources can be any expression returning GEOMETRY; see spatial's constructor functions for the full list.
Native bounding-box prefilter
The four ST_XMin / ST_XMax / ST_YMin / ST_YMax calls have GEOSILO-native implementations: they walk the integer delta stream without rebuilding a GEOMETRY. Use them in WHERE clauses to prefilter before any GEOS-backed work:
-- Everything in a longitude band, entirely native
SELECT id, name
FROM parcels
WHERE ST_XMin(geom) > -78
AND ST_XMax(geom) < -76;
-- Hand off to GEOS only for the survivors of the prefilter
SELECT id, ST_AsText(ST_Centroid(geom))
FROM parcels
WHERE ST_XMin(geom) > -78
AND ST_XMax(geom) < -76
AND ST_YMin(geom) > 39
AND ST_YMax(geom) < 41;
See ST_XMin / ST_XMax / ST_YMin / ST_YMax.
Native area, length, perimeter
ST_Area on a GEOSILO column runs the shoelace formula on int64 coordinates rebuilt from the delta stream, with no WKB decode:
-- Per-feature area
SELECT id, ST_Area(geom) AS area
FROM parcels
ORDER BY area DESC
LIMIT 10;
-- Total length of a road network
SELECT SUM(ST_Length(geom)) AS total_road_length
FROM roads;
-- Perimeter histogram
SELECT FLOOR(LOG(ST_Perimeter(geom))) AS bucket, COUNT(*) AS n
FROM parcels
GROUP BY bucket
ORDER BY bucket;
See ST_Area, ST_Length, ST_Perimeter.
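For readers who want to see the measurement arithmetic spelled out, the sketch below runs the shoelace formula and a perimeter sum on integer coordinates rebuilt from deltas. It assumes a single ring without holes and planar units (results are in squared or linear CRS units); the function names are hypothetical, and Python's unbounded integers stand in for int64.

```python
import math
from typing import List, Tuple

IntPoint = Tuple[int, int]


def rebuild(first: IntPoint, deltas: List[IntPoint]) -> List[IntPoint]:
    """Accumulate deltas into absolute integer vertices (closing vertex not repeated)."""
    x, y = first
    pts = [(x, y)]
    for dx, dy in deltas:
        x += dx
        y += dy
        pts.append((x, y))
    return pts


def shoelace_area(first: IntPoint, deltas: List[IntPoint], scale: int) -> float:
    """Polygon area: half the absolute cross-product sum, scaled back once at the end."""
    pts = rebuild(first, deltas)
    twice_area = 0
    for i, (x1, y1) in enumerate(pts):
        x2, y2 = pts[(i + 1) % len(pts)]  # wraps around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / (2 * scale * scale)


def ring_perimeter(first: IntPoint, deltas: List[IntPoint], scale: int) -> float:
    """Sum of edge lengths, including the implicit closing edge back to the first vertex."""
    pts = rebuild(first, deltas)
    total = sum(math.hypot(dx, dy) for dx, dy in deltas)
    total += math.hypot(pts[0][0] - pts[-1][0], pts[0][1] - pts[-1][1])
    return total / scale


# A 10 m x 10 m square stored at scale 100 (hypothetical data).
area = shoelace_area((0, 0), [(1000, 0), (0, 1000), (-1000, 0)], 100)
perimeter = ring_perimeter((0, 0), [(1000, 0), (0, 1000), (-1000, 0)], 100)
```

Keeping the cross-product sum in integers defers all floating-point work to a single division, so the result does not accumulate per-vertex rounding.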
Per-point coordinate access
ST_X / ST_Y on point geometries decode straight from the silo header:
SELECT id,
ST_X(geom) AS lon,
ST_Y(geom) AS lat
FROM stations;
For non-point geometries, use ST_Centroid (which auto-decodes to GEOMETRY) and then ST_X / ST_Y on the resulting point.
Filter by geometry type without decoding
geosilo_metadata reads the silo header directly. Combine with ST_GeometryType (which is also native) to filter cheaply by type:
-- Counts by geometry type, no WKB decode per row
SELECT ST_GeometryType(geom) AS gtype, COUNT(*) AS n
FROM mixed_geom_table
GROUP BY gtype
ORDER BY n DESC;
-- Inspect the integer scale of stored blobs
SELECT geosilo_metadata(silo_blob) AS meta
FROM compact_table
LIMIT 5;
-- {'geometry_type':'POLYGON','vertex_type':'XY','scale':10000000}
Manual encode / decode for BLOB pipelines
When you control the BLOB pipeline directly (exporting to a file format, shipping over a non-Arrow protocol, staging), use geosilo_encode and geosilo_decode:
-- Encode for export
COPY (
SELECT id, geosilo_encode(geom) AS silo
FROM parcels_geometry
) TO 'parcels_silo.parquet' (FORMAT 'parquet');
-- Decode on read
SELECT id, ST_AsText(geosilo_decode(silo)) AS wkt
FROM read_parquet('parcels_silo.parquet')
LIMIT 5;
For a custom integer scale (e.g. millimeter-precision projected data), pass the second argument:
SELECT geosilo_encode(geom, 1000) FROM mm_precision_data;
Mix native fast paths with GEOS-backed operations
Anything outside the native overload set auto-decodes through the implicit GEOSILO → GEOMETRY cast. You don't need to think about it, but if you're scanning many rows, push the native operations first to shrink the candidate set:
-- ST_Buffer auto-decodes; ST_XMin/XMax/YMin/YMax run native first
SELECT id, ST_AsText(ST_Buffer(geom, 0.001)) AS buffered_wkt
FROM parcels
WHERE ST_XMin(geom) > -78
AND ST_XMax(geom) < -76
LIMIT 100;
See the spatial extension for the GEOS-backed surface that auto-decodes.
Pair with sibling geo extensions
GeoSilo handles compact storage; pair it with the other geo bucket extensions for indexing and ordering:
-- a5: pentagonal geospatial index for aggregation / joins
INSTALL a5 FROM community; LOAD a5;
-- Bin parcels by A5 cell at resolution 12
SELECT a5_lonlat_to_cell(ST_X(ST_Centroid(geom)),
ST_Y(ST_Centroid(geom)),
12) AS cell,
COUNT(*) AS parcels_in_cell
FROM parcels
GROUP BY cell
ORDER BY parcels_in_cell DESC;
See a5 for the indexing surface, and lindel for space-filling curve ordering: sort a GEOSILO table by a Hilbert-encoded centroid to keep nearby geometries on the same DuckDB block / Parquet row group.
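The idea behind sorting by a space-filling curve key can be sketched without any extension. The example below uses a Z-order (Morton) bit interleave, which is simpler than a Hilbert curve and is not lindel's API or its exact encoding; `quantize`, `interleave`, and `morton_key` are hypothetical helpers, and the 16-bit grid over the full lon/lat range is an assumption.

```python
def quantize(value: float, lo: float, hi: float, bits: int = 16) -> int:
    """Map a coordinate in [lo, hi] onto an integer grid cell in [0, 2**bits - 1]."""
    max_cell = (1 << bits) - 1
    cell = int((value - lo) / (hi - lo) * max_cell)
    return max(0, min(max_cell, cell))  # clamp out-of-range input


def interleave(qx: int, qy: int, bits: int = 16) -> int:
    """Interleave bits: x bits go to even positions, y bits to odd positions."""
    key = 0
    for i in range(bits):
        key |= ((qx >> i) & 1) << (2 * i)
        key |= ((qy >> i) & 1) << (2 * i + 1)
    return key


def morton_key(lon: float, lat: float, bits: int = 16) -> int:
    """Z-order key for a centroid given in degrees."""
    return interleave(quantize(lon, -180.0, 180.0, bits),
                      quantize(lat, -90.0, 90.0, bits),
                      bits)


# Sort hypothetical (id, lon, lat) centroids by their Z-order key.
centroids = [(1, -75.5, 39.0), (2, 10.0, 50.0), (3, -75.4, 39.1)]
ordered = sorted(centroids, key=lambda row: morton_key(row[1], row[2]))
```

Rows whose centroids fall in the same grid neighborhood tend to receive nearby keys, which is what keeps them close together after the sort; a Hilbert curve serves the same purpose with fewer long jumps between adjacent keys.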
Platform Support
Compatibility
Extension availability may vary by platform and DuckDB version. Check below to ensure this extension supports your environment before installation.
Quick Facts
Platforms
Supported platform architectures
Compiled binary sizes
| Platform | Architecture | Size |
|---|---|---|
| Linux | aarch64 | 2.70 MB |
| Linux | x86_64 | 3.07 MB |
| macOS | Apple Silicon | 2.31 MB |
| macOS | Intel | 2.65 MB |
| Windows | x86_64 | 7.53 MB |
| WASM | eh | 28.3 KB |
| WASM | mvp | 26.7 KB |
| WASM | threads | 26.7 KB |
Gzipped download size from the DuckDB community-extensions registry.
Compact geometry, native fast paths
Install GeoSilo for smaller-on-disk geometry storage and faster ST_* bounding-box / area / length operations. Overlay it on the spatial extension and most existing SQL keeps working.