Getting started with GDAL

Introduction to an incredibly useful library & Command Line Interface (CLI)

Introduction

As with most data, being able to query, transform, and visualize geospatial data makes your work easier and your data more useful to you. If you're new to working with geospatial data, though, the range of geo-tools available can be overwhelming.

In this tutorial, we'll learn about a powerful, widely-used, and broadly applicable Free & Open Source Software (FOSS) tool that will empower you to do more with your data.

GDAL

GDAL, also known as GDAL/OGR, is a library of tools used for manipulating geospatial data. GDAL works on both raster and vector data types, and is an incredibly useful tool to be familiar with when working with geospatial data. While the GDAL library can be used programmatically, GDAL also includes a CLI (Command Line Interface). For the purposes of this tutorial, we will be focusing on the CLI only.

Some common uses you may have for GDAL include: quickly getting basic information about a dataset, converting between geospatial file types, clipping one dataset against another, and more. We'll walk through a few demos of these and other common use-cases later on in the examples section.

A note about names

Traditionally, the name "GDAL" was used to refer to the raster-related half of the library, while "OGR" referred to the vector part. You may commonly hear people use both "GDAL" and "GDAL/OGR" to refer to this library - but in all cases, the toolset being referenced is the same.

Installation

Visit this page for installation & setup instructions. When you're finished, come back to this page to learn more.

Commonly-used commands

Once you've verified that GDAL has been installed successfully, take a look at the CLI commands you will use in this tutorial:

  • ogrinfo: Get information about a vector dataset
  • gdalinfo: Get information about a raster dataset
  • ogr2ogr: Convert vector data between file formats
  • gdal_translate: Convert raster data between file formats

There are other commands available, but these four are the most common. We'll start by learning some typical ways you might use them.
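
If you'd like a quick way to confirm that all four commands are on your PATH, here is a minimal sketch. The function name check_gdal_tools is our own invention, not part of GDAL; it relies only on the shell's command -v, so it should work in most Unix-like shells (the OSGeo4W Shell on Windows may need a different approach).

```shell
# Report which of the four tutorial commands can be found on your PATH.
# check_gdal_tools is a hypothetical helper name, not a GDAL command.
check_gdal_tools() {
  gdal_missing=0
  for gdal_tool in ogrinfo gdalinfo ogr2ogr gdal_translate; do
    if command -v "$gdal_tool" >/dev/null 2>&1; then
      echo "found:   $gdal_tool"
    else
      echo "missing: $gdal_tool"
      gdal_missing=1
    fi
  done
  # Exit status is 0 only when every command was found.
  return "$gdal_missing"
}
```

After defining the function, run check_gdal_tools. If everything is found, gdalinfo --version will print the version of GDAL you have installed.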

Examples

Because we're using GDAL's CLI, all of your interaction with it will be via your command line (or Terminal). For Windows users who followed this tutorial's installation instructions, you'll use the OSGeo4W Shell here.

If you need a refresher course on types of geospatial data, check out this tutorial.

Working with vector data

GDAL can read dozens of file types. To see a full list of the vector data filetypes supported by your current GDAL installation, do:

ogrinfo --formats
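
The full list can be long, so it may help to filter it for the format you care about. The sketch below assumes a Unix-like shell with grep available; find_vector_format is a hypothetical helper name, and the exact layout of each line in the list varies between GDAL versions.

```shell
# Print the lines of `ogrinfo --formats` that mention a given format name.
# find_vector_format is a hypothetical helper name, not a GDAL command.
# -i ignores case, -F treats the name as plain text rather than a pattern.
# The exit status is grep's: non-zero when no line matches.
find_vector_format() {
  ogrinfo --formats | grep -i -F -- "$1"
}
```

For example, find_vector_format geojson prints every supported format whose entry contains "geojson", and prints nothing if your installation has no such format.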

In the following examples, we'll use GeoJSON files -- but you could also read KML, Shapefile, GeoPackage, and more.

Exploring data

To get basic information about your dataset, do:

ogrinfo mydata.geojson

Which will return something like this:

INFO: Open of `mydata.geojson'
  using driver `GeoJSON' successful.
1: mydata (Polygon)

What this tells you:

Your dataset, mydata.geojson, has one layer of data, called mydata, which contains Polygon features.


To learn more about that layer of Polygon features, do:

ogrinfo -so mydata.geojson mydata

Which will return something like this:

INFO: Open of `mydata.geojson'
      using driver `GeoJSON' successful.

Layer name: mydata 
Geometry: Polygon
Feature Count: 1
Extent: (-86.484375, 18.979026) - (-14.414062, 52.482780)
Layer SRS WKT:
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.0174532925199433,
        AUTHORITY["EPSG","9122"]],
    AUTHORITY["EPSG","4326"]]

What this tells you:

  1. There is 1 feature (a single polygon) in this dataset's layer
  2. The Extent (bounding box) of the dataset is given as two coordinate pairs: (min x, min y) - (max x, max y)
  3. The SRS (spatial reference system) of this dataset is WGS 84 (EPSG:4326), printed as WKT (Well-Known Text)
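
When a dataset has several layers, you may only want the feature counts rather than the whole summary. Here is a rough sketch that pulls them out of the summary text with awk. It assumes the "Layer name:" and "Feature Count:" labels shown in the sample output above; feature_counts is a hypothetical helper name. The -al flag asks ogrinfo to report on all layers, so you don't need to name each one.

```shell
# Print "<layer name>: <feature count>" for every layer in a vector dataset.
# feature_counts is a hypothetical helper name, not a GDAL command.
# -al reports on all layers; -so limits the output to the summary.
feature_counts() {
  ogrinfo -al -so "$1" | awk -F': ' '
    /^Layer name:/    { name = $2; sub(/[ \t]+$/, "", name) }
    /^Feature Count:/ { print name ": " $2 }
  '
}
```

Running feature_counts mydata.geojson on the example dataset would print one line for the mydata layer along with its count.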

Converting data

One of the most convenient features of GDAL is its ability to quickly and painlessly convert data between file types. In fact, even if you use GDAL for nothing else, data conversion alone is worth the price of admission.

To convert a vector dataset from one format to another, the basic command pattern is:

ogr2ogr -f <output format> <destination filename> <source filename>

To see a list of valid output formats, do:

ogr2ogr --formats

Note: the formats supported by a given GDAL instance can vary, depending on the installation process -- but support for common filetypes like Shapefile and GeoJSON is generally available.

Let's say we have a shapefile, mydata.shp, which we'd like to convert to a GeoJSON dataset. To do this, we would run:

ogr2ogr -f GeoJSON myconverteddata.geojson mydata.shp
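
If you have a whole folder of shapefiles, the same command can be wrapped in a loop. The following is a minimal sketch, assuming a Unix-like shell; shp_to_geojson_dir is a hypothetical helper name, and the choice to skip outputs that already exist (rather than overwrite them) is ours.

```shell
# Convert every .shp file in a directory to GeoJSON, writing each result
# next to its source file. Outputs that already exist are skipped.
# shp_to_geojson_dir is a hypothetical helper name, not a GDAL command.
shp_to_geojson_dir() {
  shp_dir="${1:-.}"
  for shp_src in "$shp_dir"/*.shp; do
    [ -e "$shp_src" ] || continue        # the glob matched no .shp files
    shp_dst="${shp_src%.shp}.geojson"
    if [ -e "$shp_dst" ]; then
      echo "skip: $shp_dst already exists"
      continue
    fi
    # Destination comes before source, as in the command pattern above.
    ogr2ogr -f GeoJSON "$shp_dst" "$shp_src" && echo "wrote: $shp_dst"
  done
}
```

Call it with a directory, for example shp_to_geojson_dir ./data, or with no argument to convert the shapefiles in the current directory.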

Working with raster data

Just as we learned that GDAL can read and write dozens of vector data types, the same is true for raster data. To see a full list of the raster data filetypes supported by your current GDAL installation, do:

gdalinfo --formats

For the following examples, we'll use GeoTIFF data for demonstration purposes.

Exploring data

To get information about your dataset, do:

gdalinfo mydata.tif

Which will return something like this:

Driver: GTiff/GeoTIFF
Files: mydata.tif
Size is 8879, 4392
Coordinate System is:
PROJCS["WGS 84 / UTM zone 38N",
    GEOGCS["WGS 84",
        DATUM["WGS_1984",
            SPHEROID["WGS 84",6378137,298.257223563,
                AUTHORITY["EPSG","7030"]],
            AUTHORITY["EPSG","6326"]],
        PRIMEM["Greenwich",0,
            AUTHORITY["EPSG","8901"]],
        UNIT["degree",0.0174532925199433,
            AUTHORITY["EPSG","9122"]],
        AUTHORITY["EPSG","4326"]],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["latitude_of_origin",0],
    PARAMETER["central_meridian",45],
    PARAMETER["scale_factor",0.9996],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",0],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]],
    AXIS["Easting",EAST],
    AXIS["Northing",NORTH],
    AUTHORITY["EPSG","32638"]]
Origin = (494265.000000000000000,2729640.000000000000000)
Pixel Size = (3.000000000000000,-3.000000000000000)
Metadata:
AREA_OR_POINT=Area
TIFFTAG_DATETIME=2017:10:03 06:53:18
Image Structure Metadata:
COMPRESSION=LZW
INTERLEAVE=PIXEL
Corner Coordinates:
Upper Left  (  494265.000, 2729640.000) ( 44d56'35.92"E, 24d40'52.01"N)
Lower Left  (  494265.000, 2716464.000) ( 44d56'36.12"E, 24d33'43.62"N)
Upper Right (  520902.000, 2729640.000) ( 45d12'23.78"E, 24d40'51.54"N)
Lower Right (  520902.000, 2716464.000) ( 45d12'23.07"E, 24d33'43.14"N)
Center      (  507583.500, 2723052.000) ( 45d 4'29.72"E, 24d37'17.79"N)
Band 1 Block=256x256 Type=UInt16, ColorInterp=Red
NoData Value=0
Overviews: 2960x1464, 987x488, 329x163
Band 2 Block=256x256 Type=UInt16, ColorInterp=Green
NoData Value=0
Overviews: 2960x1464, 987x488, 329x163
Band 3 Block=256x256 Type=UInt16, ColorInterp=Blue
NoData Value=0
Overviews: 2960x1464, 987x488, 329x163
Band 4 Block=256x256 Type=UInt16, ColorInterp=Undefined
NoData Value=0
Overviews: 2960x1464, 987x488, 329x163

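What this tells you:

  1. The file is a GeoTIFF that is 8879 pixels wide and 4392 pixels tall
  2. Its coordinate system is WGS 84 / UTM zone 38N (EPSG:32638), with units in metres
  3. Each pixel covers 3 x 3 metres, and the Corner Coordinates give the raster's extent in both projected and geographic coordinates
  4. There are four UInt16 bands (Red, Green, Blue, and one with an undefined color interpretation), each using 0 as its NoData value and each with three overviews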
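
The numbers in this output are related to one another: for a north-up raster like this one, the Lower Right corner is the Origin plus the Size multiplied by the Pixel Size. As a small worked example, the sketch below redoes that arithmetic with the values from the sample output. The function name lower_right is hypothetical, and it uses integer arithmetic only, which happens to be enough for these values.

```shell
# Compute the lower-right corner of a north-up raster from its origin
# (upper-left corner), pixel size, and size in pixels.
# lower_right is a hypothetical helper name; integer arithmetic only.
lower_right() {
  lr_origin_x=$1; lr_origin_y=$2   # Origin = (x, y)
  lr_pixel_w=$3;  lr_pixel_h=$4    # Pixel Size = (width, height); height is negative
  lr_cols=$5;     lr_rows=$6       # Size is <columns>, <rows>
  echo "$((lr_origin_x + lr_cols * lr_pixel_w)) $((lr_origin_y + lr_rows * lr_pixel_h))"
}

# Values taken from the sample gdalinfo output above.
lower_right 494265 2729640 3 -3 8879 4392
# prints: 520902 2716464
```

The result matches the Lower Right line under Corner Coordinates, which is a handy sanity check when you read gdalinfo output for your own data.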

Converting data

We can also use GDAL to convert raster data from one file format to another, just as we learned to do with vector data.

To convert a raster dataset from one format to another, the basic command pattern is:

gdal_translate -of <output format> <source filename> <destination filename>

Note: this command is similar to ogr2ogr, but not quite the same. The output format is indicated by -of rather than -f, and the order of the source and destination filenames is reversed.

To see a list of valid output formats, do:

gdal_translate --formats

As an example, let's transform a GeoTIFF raster dataset mydata.tif into a georeferenced PNG. To do that, we would run:

gdal_translate -of PNG mydata.tif myconverteddata.png
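
Because the argument order differs from ogr2ogr, it can be handy to wrap the conversion in a small function so you only have to get it right once. This is just a sketch: tif_to_png is a hypothetical helper name, the output filename is derived from the input filename, and the -q flag is added to keep gdal_translate quiet so that only the new filename is printed.

```shell
# Convert a raster to PNG, deriving the output name from the input name
# (mydata.tif becomes mydata.png). Prints the output name on success.
# tif_to_png is a hypothetical helper name, not a GDAL command.
tif_to_png() {
  png_src="$1"
  png_dst="${png_src%.*}.png"
  # Source comes before destination here, the reverse of ogr2ogr.
  gdal_translate -q -of PNG "$png_src" "$png_dst" && echo "$png_dst"
}
```

With this defined, tif_to_png mydata.tif runs the conversion and prints mydata.png when it succeeds.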