By: Chris Holmes on May 26 2022
Planet and cloud-native geospatial open source
After Sara’s announcement of our new blog, I have the honor of writing the second substantive post on this blog. I’ve been at Planet for a long time and have always felt developers are our most important users. So I’m pleased to share that just recently I shifted my role to become the Product Manager of the Developer Relations team that Sara leads. Most exciting for me is that we’ve expanded the scope of the team to include what we call “Open Initiatives,” one of which is “Cloud-Native Geospatial,” encompassing all the work I’ve been doing on things like SpatioTemporal Asset Catalogs (STAC) and Cloud-Optimized GeoTIFFs (COG), plus new topics like GeoParquet.
A lot of my time recently went into organizing the Cloud-Native Geospatial Outreach Event that happened last month. Planet was a top sponsor, and a number of Planeteers gave talks. It’s super cool to watch the videos of the talks and to see how the community just continues to explode. With over 1600 registrations, I think we’ll see another jump in momentum after the event. I wanted to share a bit about Planet’s pioneering role in Cloud-Native Geospatial, as well as what we’re working on next and why we’re excited about this great ecosystem.
Planet and the genesis of Cloud-Native Geospatial
Planet was lucky to be among the first “cloud-native” satellite imagery providers (perhaps even the first). It was really a matter of timing, as Planet was founded right when any sensible Silicon Valley startup trying to achieve scale moved to the cloud. At that time, a standard image processing pipeline would involve image processing experts using desktop software to produce the imagery for customers. But Planet had huge aspirations of scale, with the mission to “image the whole earth every day” at the center of what everyone did. The amount of data coming in from Planet’s planned constellation meant that everything needed to be automated. So Planet simply built out its data pipeline and data hosting platform the right way, and became a big supporter of the “cloud-native geospatial” movement before it even had a name.
The movement clearly started with the advent of the Cloud-Optimized GeoTIFF, which Planet played a key role in creating. The idea behind COG was discussed and built out in the AWS Landsat Public Dataset project, with Planet as a key contributor. Then it came together as a standard in a meeting I remember at Planet headquarters. We had a whiteboard session with Frank Warmerdam of GDAL fame, Matt Hancher who co-founded Google Earth Engine, and Rob Emanuele who led RasterFoundry at Azavea and now leads engineering on Microsoft Planetary Computer. We wanted a format that Planet could produce and that would be streamable into these two new cloud-native geospatial compute engines. And one that is ideally backwards compatible with a standard GeoTIFF, so it would still work for local workflows. Planet then funded Even Rouault to create the original specification and document the GDAL drivers.
Planet also worked on the evolution of SpatioTemporal Asset Catalogs (STAC), which started when Radiant Earth convened a diverse group of geospatial experts and organizations in Boulder, CO to collaborate on the interoperability of data catalogs. I recently posted on the history of Planet’s support of STAC. Planet’s role in STAC is one of the things I’m most proud of, and it’s fun to see it being integrated into Planet’s APIs.
Why we support Cloud-Native Geospatial
Planet supports cloud-native geospatial because our imagery must be much more accessible to have the impact we aspire to. I’d like to explain a bit more about why we support this ecosystem.
There are two critical economic shifts transforming the world:
- The Digital Transformation, where organizations are using Big Data and Artificial Intelligence to understand what they do and to do it more efficiently
- The Sustainability Transformation, where data about our planet is key to valuing natural systems in the economy
Geospatial information is useful to many organizations taking part in either or both of these transformations. But the benefit will not be realized if everyone must become an expert in remote sensing and GIS. It is incumbent upon us to make information about the earth accessible and integrated into the workflows people use every day. And the biggest challenges always need more data sources, combined in insightful ways. Planet’s APIs and data need to be available in the formats, tools, and channels used to create solutions that make a difference.
Cloud-native geospatial has the potential to make geospatial data far more accessible within existing workflows and architectures. When it does, users don’t need to be experts in remote sensing and GIS. They just need to understand how to work with data.
Making Planet’s data more accessible to developers
Planet is working hard to ensure the developer experience is as solid as possible. The headline news of 8-band data availability is certainly cool, but what I’ve personally found especially impressive is how the team has greatly improved the quality of the imagery and reduced the complexity of accessing it. Improvements in our data pipeline include better alignment between pixels, a reduction in the number of artifacts, and a sharpening of the visual quality. And the new PSScene product simplifies and future-proofs how users and developers access imagery. Not quite as new, but seeing substantial adoption, is the Subscriptions API, which greatly reduces the development time needed to integrate any monitoring workflow with Planet. Another great feature is the new harmonization tool, one of the key operations for Planet’s delivery tools in providing full Analysis Ready Data in an On-Demand workflow. Together these improvements are a huge step towards data that “just works,” enabling developers to order an atmospherically corrected, pixel- and sensor-aligned stack of imagery for time series analysis without having to even think about all the complexities of remote sensing.
The next frontier in making Planet even more accessible to developers is the higher level data products that directly extract insights from satellite imagery. For example, Planet offers our Road and Building Change Analytics and what we call “Planetary Variables”, including Soil Water Content, Land Surface Temperature, and a proxy for Vegetation Biomass. These Planetary Variables go beyond just Planet’s imagery, fusing several different data sources, and will open up new use cases. Planet moving further up the stack, and into the “vector” area of geospatial, means much more access for new developers. And there are some interesting interoperability opportunities that we hope to contribute to.
What’s next for Planet and Cloud-Native Geospatial?
Planet data generates far more insight when it can be combined with other data. So we believe in helping create an ecosystem of tools and data that have interoperability at their core. One route is supporting tools like GDAL and Rasterio, which translate between many formats. Another route is working toward the interoperability of the next generation of workflows, as COG and STAC do.
Building our team
Planet has recently been increasing its resources working on developer relations and cloud-native geospatial, including bringing on folks to work full-time on “open initiatives.”
- Sean Gillies joined the developer relations team a few months ago. Planet is supporting his time on Rasterio, Shapely, and Fiona—some of the most important geospatial tools in the Python ecosystem. The other project he’s helping out on is a new version of Planet’s Python client and command-line library, which you’ll hear about soon on this blog.
The developer relations team has also had a number of other great new hires, whom you’ll hear from on this blog.
COG and STAC
In order for cloud-native geospatial to reach its full potential, COG and STAC will have to “cross the chasm” to mainstream adoption. To do so, we want to help everyone get a sense of just how much data there is in STAC and COG. If we can’t measure how much data is available in these new formats, then we won’t be able to actually track progress and determine which of the various initiatives are working. To start, we’re focused on a crawler pointed at STAC Catalogs that reports back stats on the number of STAC Items, the STAC Extensions used, and what format (COG, JP2K, Zarr, etc.) the assets are stored in. This, in turn, will help inform a new STAC extension that makes reporting easier, so we’re not having to hand-crawl tens of millions of STAC records. We hope to report on the overall STAC and COG data holdings that anyone can access. We’ll work to integrate with STACIndex.org, which is where this idea originally came from.
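To make the crawler idea a bit more concrete, here is a minimal Python sketch of that kind of stats gathering. It is not the go-stac library and not how Planet's crawler is written; it only illustrates the tally. The `DOCS` dictionary, the file names in it, and the `crawl` function are hypothetical stand-ins for documents fetched over HTTP, and link hrefs are assumed to be usable directly as lookup keys (a real crawler would resolve relative links). The field names (`type`, `links`, `rel`, `stac_extensions`, `assets`) follow the STAC specification.

```python
from collections import Counter

# Media types as commonly used in STAC asset "type" fields.
COG = "image/tiff; application=geotiff; profile=cloud-optimized"
JP2 = "image/jp2"
EO = "https://stac-extensions.github.io/eo/v1.0.0/schema.json"
PROJ = "https://stac-extensions.github.io/projection/v1.0.0/schema.json"

# Hypothetical in-memory catalog: href -> STAC JSON document.
DOCS = {
    "catalog.json": {
        "type": "Catalog",
        "links": [
            {"rel": "item", "href": "item-a.json"},
            {"rel": "child", "href": "sub/catalog.json"},
        ],
    },
    "sub/catalog.json": {
        "type": "Catalog",
        "links": [{"rel": "item", "href": "item-b.json"}],
    },
    "item-a.json": {
        "type": "Feature",
        "stac_extensions": [EO],
        "assets": {"visual": {"href": "a.tif", "type": COG}},
    },
    "item-b.json": {
        "type": "Feature",
        "stac_extensions": [EO, PROJ],
        "assets": {
            "visual": {"href": "b.tif", "type": COG},
            "preview": {"href": "b.jp2", "type": JP2},
        },
    },
}


def crawl(href, fetch, stats=None, seen=None):
    """Recursively walk a catalog and tally items, extensions, and asset types."""
    if stats is None:
        stats = {"items": 0, "extensions": Counter(), "asset_types": Counter()}
    if seen is None:
        seen = set()
    if href in seen:  # guard against link cycles
        return stats
    seen.add(href)

    doc = fetch(href)
    if doc is None:  # missing document: skip it rather than fail the crawl
        return stats

    if doc.get("type") == "Feature":  # STAC Items are GeoJSON Features
        stats["items"] += 1
        stats["extensions"].update(doc.get("stac_extensions", []))
        for asset in doc.get("assets", {}).values():
            stats["asset_types"][asset.get("type", "unknown")] += 1
        return stats

    # Catalogs and Collections: follow only child and item links.
    for link in doc.get("links", []):
        if link.get("rel") in ("child", "item"):
            crawl(link["href"], fetch, stats, seen)
    return stats


stats = crawl("catalog.json", DOCS.get)
print(stats["items"], dict(stats["asset_types"]))
```

Passing the fetch function in as an argument keeps the tally logic separate from how documents are retrieved, so the same walk could in principle sit on top of an HTTP client.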
To support this, Tim has revived the Planet go-stac open source library, getting it up to speed with STAC 1.0.0, and improving its crawling and validation capabilities. It’s now capable of very fast recursive crawling, with more improvements coming soon.
We are also starting to use the library internally to build and deploy Planet’s STAC Catalog of open data.
Another area where we’ll be focused is the standards that can make our higher-level “Planetary Variables” data more interoperable. The goal is to have these available in “vector” data formats—the points, lines, and polygons that can be represented as rows in a database. Indeed, one end goal would be delivering daily information as simple tabular values against known geometries. This would mean a user already has the geometry of a state, county or even a field and they’d just get a daily update—say, of the plant biomass or soil moisture reading for the day.
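As a rough illustration of what “tabular values against known geometries” could look like from the user's side, here is a small sketch using Python's built-in `sqlite3` module. The table names, column names, and numbers are all made up for the example and do not reflect any actual Planet delivery schema; the point is only that the user keeps one row per geometry and joins daily rows against it by ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- The user's own geometries, stored once (WKT text for simplicity).
    CREATE TABLE fields (field_id TEXT PRIMARY KEY, name TEXT, geometry_wkt TEXT);
    -- Hypothetical daily deliveries: one row per field, day, and variable.
    CREATE TABLE daily_values (field_id TEXT, day TEXT, variable TEXT, value REAL);
    """
)

conn.executemany(
    "INSERT INTO fields VALUES (?, ?, ?)",
    [
        ("f1", "North field", "POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))"),
        ("f2", "South field", "POLYGON((0 -1, 0 0, 1 0, 1 -1, 0 -1))"),
    ],
)

# Illustrative values only, not real measurements.
conn.executemany(
    "INSERT INTO daily_values VALUES (?, ?, ?, ?)",
    [
        ("f1", "2022-05-25", "soil_water_content", 0.21),
        ("f1", "2022-05-26", "soil_water_content", 0.19),
        ("f2", "2022-05-26", "soil_water_content", 0.30),
        ("f1", "2022-05-26", "land_surface_temperature", 295.4),
    ],
)

# A plain join: no raster processing or remote sensing knowledge required.
rows = conn.execute(
    """
    SELECT f.name, d.day, d.value
    FROM fields AS f
    JOIN daily_values AS d ON d.field_id = f.field_id
    WHERE d.variable = ?
    ORDER BY f.name, d.day
    """,
    ("soil_water_content",),
).fetchall()

for name, day, value in rows:
    print(name, day, value)
```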
Cloud data warehouses like BigQuery, Snowflake, and Redshift are driving a revolution in how organizations handle all of their data, and all have native geospatial support. So there is an opportunity to fit Planet’s daily variables about our earth directly into the workflows being used today. This has led us to helping seed efforts like GeoParquet. Javier de la Torre, from Carto, wrote a great overview introducing GeoParquet. Planet is working to build the community and the specification, and we funded the development of the GDAL/OGR reader that is included in GDAL 3.5.0. An interoperable format would enable us to publish data once and stream to a variety of cloud tools.
Next step: Join us
We’ve gotten this far by collaborating with others, and I think the opportunity to make geospatial more accessible is limitless. We hope others will join us in open collaboration. If you’d like to continue to get updates on what we’re up to in cloud-native geospatial, as well as all kinds of content about our APIs, tools, and the geo technical developer community in general, please follow our blog. And do check out our contributions to the Cloud-Native Geospatial Outreach Event; see the playlist on YouTube for all the content.
Except for the historical information contained herein, the matters set forth in this blog are forward-looking statements within the meaning of the "safe harbor" provisions of the Private Securities Litigation Reform Act of 1995, including, but not limited to, the Company’s ability to capture market opportunity and realize any of the potential benefits from current or future product enhancements, new products, or strategic partnerships and customer collaborations. Forward-looking statements are based on the Company’s management’s beliefs, as well as assumptions made by, and information currently available to them. Because such statements are based on expectations as to future events and results and are not statements of fact, actual results may differ materially from those projected. Factors which may cause actual results to differ materially from current expectations include, but are not limited to the risk factors and other disclosures about the Company and its business included in the Company's periodic reports, proxy statements, and other disclosure materials filed from time to time with the Securities and Exchange Commission (SEC) which are available online at www.sec.gov, and on the Company's website at www.planet.com. All forward-looking statements reflect the Company’s beliefs and assumptions only as of the date such statements are made. The Company undertakes no obligation to update forward-looking statements to reflect future events or circumstances.