Lightweight GIS pipelines with fio-planet


By: Sean Gillies on April 27 2023

Planet’s new CLI has no builtin GIS capabilities other than what Planet’s APIs provide. To help make more sophisticated workflows possible, Planet is releasing a new package of command line programs for manipulating streams of GeoJSON features.

Making Command Line GIS Pipelines with fio-planet

Planet’s new command line interface (CLI) has no builtin GIS capabilities other than what Planet’s APIs provide. This is by design; it is meant to be complemented by other tools. To help make more sophisticated workflows possible, Planet is releasing a new package of command line programs, fio-planet, which let you build Unix pipelines for manipulating streams of GeoJSON features. Feature simplification before searching the Data API is one such application.

Planet’s new command line interface (CLI) has three main jobs. One is to help you prepare API requests. Planet’s APIs and data products have many options. The CLI provides tools to make specifying what you want easy. Another purpose is to make the status of those requests visible. Some requests take a few minutes to be fulfilled. The CLI permits notification when they are done, like the following command line output.

planet orders wait 65df4eb0-e416-4243-a4d2-38afcf382c30 && cowsay "your order is ready for download"
__________________________________
< your order is ready for download >
 ----------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Or you may prefer to leave cows alone and just download the order when it is ready.

planet orders wait 65df4eb0-e416-4243-a4d2-38afcf382c30 && planet orders download 65df4eb0-e416-4243-a4d2-38afcf382c30

The third purpose of the CLI is to print API responses in a form that lets them be used as inputs to other CLI programs such as jq, the streaming JSON filter. CLI commands, generally speaking, print newline-delimited streams of JSON objects or GeoJSON features.

By design, the CLI has limited options for manipulating JSON and relies heavily on jq. The CLI docs have many examples of usage in combination with jq. Similarly, the CLI outsources GIS capabilities to other programs. One option is the inscrutable and venerable ogr2ogr, but it is not as handy with streaming JSON or GeoJSON as jq is.

To help make sophisticated command line geo-processing workflows more accessible, Planet is releasing a package of command line programs that let you build Unix pipelines for manipulating streams of GeoJSON features. Feature simplification before searching the Data API is one of the many applications.

Processing GeoJSON with fio and fio-planet

The Python package Fiona includes a CLI designed around streams of GeoJSON features. The fio-cat command streams GeoJSON features out of one or more vector datasets. The fio-load command performs the reverse operation, streaming features into GIS file formats. Two Shapefiles with the same schema can be combined into a GeoPackage file using the Unix pipeline below.

fio cat example1.shp example2.shp | fio load -f GPKG examples.gpkg

Note: pipeline examples in this blog post assume a POSIX shell such as bash or zsh. The vertical bar | is a “pipe”. It creates two processes and connects the standard output stream of one to the standard input stream of the other. GeoJSON features pass through this pipe from one program to the other.

Planet’s new fio-planet package fits into this conceptual framework and adds three commands to the fio toolbox: fio-filter, fio-map, and fio-reduce. These commands provide some of the features of spatial SQL, but act on features in a GeoJSON feature sequence instead of rows in a spatial table. Each command accepts a JSON sequence as input, evaluates an expression in the context of the sequence, and produces a new JSON sequence as output. These may be GeoJSON sequences.

  • fio-filter evaluates an expression for each feature in a stream of GeoJSON features, passing those for which the expression is true.
  • fio-map maps an expression over a stream of GeoJSON features, producing a stream of new features or other values
  • fio-reduce applies an expression to a sequence of GeoJSON features, reducing them to a single feature or other value.

In combination, many transformations are possible with these three commands.

Expressions take the form of parenthesized, comma-less lists which may contain other expressions. The first item in a list is the name of a function or method, or an expression that evaluates to a function. It’s a kind of prefix, or Polish notation. The second item is the function's first argument or the object to which the method is bound. The remaining list items are the positional and keyword arguments for the named function or method. The list of functions and callables available in an expression includes:

  • Python builtins such as dict, list, and map
  • From functools: reduce
  • All public functions from itertools, for example, islice, and repeat
  • All functions importable from Shapely 2.0, for example, Point, and unary_union
  • All methods of Shapely geometry classes

Let’s look at some examples. Below is an expression that evaluates to a Shapely Point instance. Point is a callable instance constructor and the pair of 0 values are positional arguments. Note that the outermost parentheses of an expression are optional.

(Point 0 0)

Fio-planet translates that to the following Python expression.

from shapely import Point
Point(0, 0)

Next is an expression which evaluates to a Polygon, using Shapely’s buffer function. The distance parameter is computed from its own expression, demonstrating prefix notation once again.

(buffer (Point 0 0) :distance (/ 5 2))

The equivalent in Python is

from shapely import buffer
buffer(Point(0, 0), distance=5 / 2)

Fio-filter and fio-map evaluate expressions in the context of a GeoJSON feature and its geometry attribute. These are named f and g. For example, here is an expression that tests whether the distance from the input feature to the point at 0 degrees North and 0 degrees E is less than or equal to one kilometer (1,000 meters).

(<= (distance g (Point 0 0)) 1000)

fio-reduce evaluates expressions in the context of the sequence of all input geometries, which is named c. For example, the expression below dissolves all input geometries using Shapely's unary_union function.

(unary_union c)

Fio-filter evaluates expressions to true or false. Fio-map and fio-reduce will attempt to wrap expression results as GeoJSON feature objects unless raw results are specified. For more information about expressions and usage, please see the fio-planet project page.

Why does fio-planet use Lisp-like expressions for code that is ultimately executed by a Python interpreter? Fio-planet expressions are intended to allow a small subset of what you can do in GeoPandas, PostGIS, QGIS, or a bespoke Python program. By design, fio-planet’s expressions should help discriminate between workflows that can execute effectively on the command line using GeoJSON and workflows that are better executed in a more powerful environment. A domain-specific language (DSL), instead of a general purpose language, is intended to clarify the situation. Fio-planet’s DSL doesn’t afford variable assignment or flow control, for example. That’s supposed to be a signal of its limits. As soon as you feel fio-planet expressions are holding you back, you are right. Use something else!

The program ogr2ogr uses a dialect of SQL as its DSL. The QGIS expression language is also akin to a subset of SQL. Those programs tend to see features as rows of a table. Fio-planet works with streams, not tables, and so a distinctly different kind of DSL seemed appropriate. It adopted the Lisp-like expression language of Rasterio. Parenthesized lists are very easy to parse and Lisp has a long history of use in DSLs. Hylang is worth a look if you’re curious about a more fully featured Lisp for the Python runtime.

Lisp Cycles
(https://xkcd.com/297/)

Simplifying shapes with fio-map and fio-reduce

Previously, the Planet Developers Blog published a deep dive into simplifying areas of interest for use with the Planet platform. The takeaway from that post is that it’s a good idea to simplify areas of interest as much as you can. Fio-planet brings methods for simplifying shapes and measuring the complexity of shapes to the command line alongside Planet’s CLI. Examples of using them in the context of the previous post are shown below. All the examples use a 25-feature shapefile. You can get it from rmnp.zip or access it in a streaming fashion as shown in the examples below.

Note: all examples assume a POSIX shell such as bash or zsh. Some use the program named jq. The vertical bar | is a “pipe”. It creates two processes and connects the standard output stream of one to the standard input stream of the other. The backward slash \ permits line continuation and allows pipelines to be typed in a more readable form.

A figure accompanies each example. The figures are rendered from a GeoJSON file by QGIS. The GeoJSON files are made by collecting, with fio-collect, the non-raw output of fio-cat, fio-map, or fio-reduce. For example, the pipeline below converts the zipped Shapefile on the web to a GeoJSON file on your computer.

fio cat zip+https://github.com/planetlabs/fio-planet/files/10045442/rmnp.zip \
| fio collect > rmnp.geojson

Counting vertices in a feature collection

The vertex_count function, in conjunction with fio-map's --raw option, prints out the number of vertices in each feature. The default for fio-map is to wrap the result of every evaluated expression in a GeoJSON feature; --raw disables this. The program jq provides a nice way of summing the resulting sequence of numbers. jq -s “slurps” a stream of JSON objects into an array.

The following pipeline prints the number 28,915.

Input:

fio cat zip+https://github.com/planetlabs/fio-planet/files/10045442/rmnp.zip \
| fio map 'vertex_count g' --raw \
| jq -s 'add'

Output:

28915
Zones
The 25 Wilderness Patrol Zones of Rocky Mountain National Park have 28,915 vertices.

Counting vertices after making a simplified buffer

One traditional way of simplifying an area of interest is to buffer it by some distance and then simplify it by a comparable distance. The effectiveness of this method depends on the nature of the data, especially the distance between vertices around the boundary of the area. There's no need to use jq in the following pipeline because fio-reduce prints out a sequence of exactly one value.

Input:

fio cat zip+https://github.com/planetlabs/fio-planet/files/10045442/rmnp.zip \
| fio reduce 'unary_union c' \
| fio map 'simplify (buffer g 40) 40' \
| fio map 'vertex_count g' --raw

Output:

469

Variable assignment and flow control are not provided by fio-planet’s expression language. It is the pipes between commands which afford some assignment and logic. For example, the pipeline above is practically equivalent to the following Python program.

import fiona
from fiona.transform import transform_geom
from shapely import buffer, shape, simplify, mapping, unary_union


with fiona.open("zip+https://github.com/planetlabs/fio-planet/files/10045442/rmnp.zip") as dataset:
    c = [shape(feat.geometry) for feat in dataset]


g = unary_union(c)


g = shape(transform_geom("OGC:CRS84", "EPSG:6933", mapping(g)))
g = simplify(buffer(g, 40), 40)
g = shape(transform_geom("EPSG:6933", "OGC:CRS84", mapping(g)))


print(vertex_count(g))  
Simplified
Zones merged, buffered, and simplified into one shape with 469 vertices.

Counting vertices after merging convex hulls of features

Convex hulls are an easy means of simplification. There are no distance parameters to tweak as there were in the example above. The --dump-parts option of fio-map turns the parts of multi-part features into separate single-part features. This is one of the ways in which fio-map can multiply its inputs, printing out more features than it receives.

The following pipeline prints the number 157.

Input:

fio cat zip+https://github.com/planetlabs/fio-planet/files/10045442/rmnp.zip \
| fio map 'convex_hull g' --dump-parts \
| fio reduce 'unary_union c' \
| fio map 'vertex_count g' --raw

Output:

157
Convex
The convex hulls of the zones, when merged, have 157 vertices.

Counting vertices after merging the concave hulls of features

Convex hulls simplify, but also dilate concave areas of interest. They fill the "bays", so to speak, and this can be undesirable. Concave hulls do a better job at preserving the concave nature of a shape and result in a smaller increase of area.

The following pipeline prints the number 301.

Input:

fio cat zip+https://github.com/planetlabs/fio-planet/files/10045442/rmnp.zip \
| fio map 'concave_hull g 0.4' --dump-parts \
| fio reduce 'unary_union c' \
| fio map 'vertex_count g' --raw

Output:

301
Concave
The concave hulls of the zones, when merged, have 301 vertices.

Using fio-planet with the Planet CLI

Now for more specific examples of using fio-planet with Planet’s CLI on the command line. If you want to spatially constrain a search for assets in Planet’s catalog or clip an order to an area of interest, you will need a GeoJSON object. The pipeline below creates one based on the data used above and saves it to a file. Note that the fourth piece of the pipeline forces the output GeoJSON to be two-dimensional. Planet’s Data API doesn’t accept GeoJSON with Z coordinates.

fio cat zip+https://github.com/planetlabs/fio-planet/files/10045442/rmnp.zip \
| fio map 'concave_hull g 0.4' --dump-parts \
| fio reduce 'unary_union c' \
| fio map 'force_2d g' \
| fio collect \
> rmnp.geojson

Incorporating that GeoJSON object into a Data API search filter is the next step. The filter produced by the command shown below will find items acquired after Feb 14, 2023 that intersect with the shape of Rocky Mountain National Park.

planet data filter \
--date-range acquired gt 2023-02-14 \
--date-range acquired lt 2023-02-25 \
--geom=rmnp.geojson \
> filter.json

With a filter document, stored here in filter.json – the name of the filter doesn’t matter – you can make a filtered search and stream the GeoJSON results into a file. It’s a common practice to use .geojsons, plural, as the file extension for newline-delimited sequences of GeoJSON features.

planet data search PSScene --filter=filter.json > results.geojsons

With a filter document, stored here in filter.json – the name of the filter doesn’t matter – you can make a filtered search and stream the GeoJSON results into a file. It’s a common practice to use .geojsons, plural, as the file extension for newline-delimited sequences of GeoJSON features.

planet data search PSScene --filter=filter.json > results.geojsons

The fio-planet commands are useful for processing search results, too. The results.geojsons file contains a stream of GeoJSON features which represent some PSScene items in Planet’s catalog. Here’s an example of finding the eight scenes that cover 40.255 degrees North and 105.615 degrees West (the summit of Longs Peak in Rocky Mountain National Park).

Input:

cat results.geojsons \
| fio filter 'contains g (Point -105.615 40.255)' \
| jq '.id'

Output:

"20230222_172633_18_247f"
"20230221_165059_39_241b"
"20230221_165057_22_241b"
"20230220_171805_27_2251"
"20230219_172658_28_2461"
"20230217_172940_13_2474"
"20230216_171328_72_2262"
"20230214_173045_31_2489"

To further filter by the degree to which the ground is visible in each scene – which could also have been specified in the search filter, by the way – we can add a jq step to the pipeline.

cat results.geojsons \
| fio filter 'contains g (Point -105.615 40.255)' \
| jq -c 'select(.properties.visible_percent > 90)' \
| fio collect \
> psscenes.geojson
Search
Three PlanetScope Scene footprints mapped with RMNP and Longs Peak.

Fio-filter and fio-map can also be used as checks on the number of vertices and the area of GeoJSON features. Let’s say that you want to keep your areas of interest to 500 vertices or less and no more than 2,000 square kilometers. You can ask fio-map to print true or false or ask fio-filter to screen out features that don’t meet the criteria.

Input:

fio cat rmnp.geojson \
| fio map -r '& (< (vertex_count g) 500) (< (area g) 2000e6)'

Output:

true

Note that the value returned by the builtin area function has units of meters squared. The area of the feature in the rmnp.geojson file is 1,117.4 square kilometers and has 301 vertices.

Creating search geometries on the command line

What if you want to do a quick search that isn’t related to any particular GIS dataset? The new fio-map command can help.

All of Shapely’s geometry type constructors are available in fio-planet expressions and can be used to create new shapes directly on the command line. Remember that fio-map’s --raw/-r option specifies that the outputs of the command will not be wrapped in GeoJSON feature objects, but returned in their natural, raw form. Another fio-map option, --no-input/-n, specifies that the given expression will be mapped over the sequence [None], as with jq --null-input/-n, producing a single output. Together, these options let fio-map produce new GeoJSON data that is not based on any existing data. On the command line you can combine the options as -nr or -rn. If it helps, think of -rn as “right now”.

Input:

fio map -rn '(Point -105.615 40.255)'

Output:

{"type": "Point", "coordinates": [-105.615, 40.255]}

This value can be saved to a file and used with the planet-data-filter program.

fio map -rn '(Point -105.615 40.255)' > search.geojson
planet data filter --geom search.geojson > filter.json
planet data search PSScene --filter filter.json

Or it can be used inline to make a search without saving any intermediate files at all using POSIX command substitution with the $(...) syntax, although nested substitution takes a heavy toll on the readability of commands. Please speak with your tech lead before putting pipelines like the one below into production.

planet data search PSScene --filter="$(planet data filter --geom="$(fio map -rn '(Point -105.615 40.255)')")"

Creating and simplifying GeoJSON features for use with the Planet CLI are two of the applications for fio-planet’s map, filter, and reduce commands. You can surely think of more! Combine them in different ways to create new pipelines suited to your workflows and share your experience with others in Planet’s Developers Community or in the fio-planet discussion forum.

Next Steps

Keep up to date with the fio-planet GitHub repository and the latest project documentation. Chat with us about your experiences and issues using Fiona and fio-planet with Planet data at https://community.planet.com/developers-55. Follow us on Twitter @PlanetDevs. Sign up for Wavelengths, the Planet Developer Relations newsletter, to get more information on the tech behind the workflows.