So... what actually is a TIFF file anyway?¶
TIFF stands for Tagged Image File Format, a format for storing raster graphics images. It was designed to establish a common file format for scanned images.
TIFF Header¶
The header of a TIFF file records the byte order used in the file, an identifier for the TIFF version, and where to find the first Image File Directory (IFD) in the file.
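To make the header concrete, here is a minimal sketch that decodes the 8-byte header of a classic TIFF using only the standard library. The function name `parse_tiff_header` and the sample bytes are made up for illustration; the layout (a 2-byte byte-order mark, the number 42, then a 4-byte offset to the first Image File Directory) follows the classic TIFF layout, and BigTIFF files are deliberately not handled.

```python
import struct


def parse_tiff_header(header: bytes) -> dict:
    """Decode the 8-byte header of a classic (non-BigTIFF) TIFF file."""
    if len(header) < 8:
        raise ValueError("a classic TIFF header is 8 bytes long")

    # Bytes 0-1: byte-order mark, "II" (little-endian) or "MM" (big-endian)
    byte_order = header[:2]
    if byte_order == b"II":
        endian, label = "<", "little-endian"
    elif byte_order == b"MM":
        endian, label = ">", "big-endian"
    else:
        raise ValueError("not a TIFF file: unknown byte-order mark")

    # Bytes 2-3: version identifier (42); bytes 4-7: offset of the first IFD
    magic, first_ifd_offset = struct.unpack(endian + "HI", header[2:8])
    if magic != 42:
        raise ValueError(f"unexpected identifier {magic} (BigTIFF uses 43)")

    return {"byte_order": label, "magic": magic, "first_ifd_offset": first_ifd_offset}


# Hypothetical header: little-endian, first IFD located right after the header
sample_header = b"II" + struct.pack("<HI", 42, 8)
print(parse_tiff_header(sample_header))
```

With a real file, the same function could be fed the first 8 bytes read from disk, for example `open(path, "rb").read(8)`.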
Image File Directories and Data¶
What Makes a GeoTIFF Different?¶
A GeoTIFF file has some additional tags containing information about where the image exists on Earth. This includes things like map projections, coordinate systems, datums, etc. Basically, lots of interesting geospatially relevant info that provides a spatial reference.
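As a rough illustration of what that spatial reference buys you, the sketch below converts a pixel position into map coordinates using two of the GeoTIFF tags, ModelPixelScaleTag and ModelTiepointTag. It assumes a simple north-up image with no rotation; the function name `pixel_to_world` and all of the numeric values are hypothetical, not taken from the sample files used later.

```python
def pixel_to_world(col, row, pixel_scale, tiepoint):
    """Map a pixel (col, row) to map coordinates for a north-up image.

    pixel_scale: (scale_x, scale_y, scale_z), as stored in ModelPixelScaleTag
    tiepoint:    (i, j, k, x, y, z), as stored in ModelTiepointTag, tying
                 raster position (i, j) to map position (x, y)
    """
    scale_x, scale_y, _ = pixel_scale
    i, j, _, x, y, _ = tiepoint
    world_x = x + (col - i) * scale_x
    # Rows count downward from the top of the image, so y decreases with row
    world_y = y - (row - j) * scale_y
    return world_x, world_y


# Hypothetical values: 3 m pixels, upper-left corner at (583000, 4515000)
pixel_scale = (3.0, 3.0, 0.0)
tiepoint = (0.0, 0.0, 0.0, 583000.0, 4515000.0, 0.0)
print(pixel_to_world(10, 20, pixel_scale, tiepoint))
```

The units of the result depend on the coordinate system recorded in the file's other geospatial tags, which this sketch does not read.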
What is a Cloud Optimized GeoTIFF, and why is it important?¶
A Cloud Optimized GeoTIFF, or COG, is a type of GeoTIFF file whose internal layout is organized so that it can be read efficiently from cloud storage. So why is this format so important?
The Why¶
- Analyze Problems on a Global Scale¶
- Handle Exponential Growth of Data¶
- Allow Efficient Streaming of Data - Reduce Data Duplication¶
- Democratizing Data Science - Make Geospatial Data More Accessible and Available¶
The How¶
Cloud Optimized GeoTIFFs are the gold standard format for storing raster data in cloud storage. Why? Well, it comes down to how the format is structured internally. COGs depend on several techniques that work in conjunction with one another. The way pixels are organized (internal tiling, overviews, compression) makes it easier for users to access just the parts of the data that correspond to their area of interest, without needing to download the entire file first.
Organization¶
Internal Tiling¶
In a COG, pixels are stored in tiles. This makes it possible to access just the part of a file that is needed, because all the pixels for a given region are stored together in the same tile.
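The effect of tiling can be sketched with a little arithmetic: given a pixel window of interest, work out which tiles overlap it. The function name `tiles_for_window`, the 512-pixel tile size, and the window below are assumptions chosen for illustration; real files declare their own tile dimensions.

```python
def tiles_for_window(col_off, row_off, width, height, tile_size=512):
    """Return the (tile_row, tile_col) indices that overlap a pixel window."""
    first_tile_col = col_off // tile_size
    last_tile_col = (col_off + width - 1) // tile_size
    first_tile_row = row_off // tile_size
    last_tile_row = (row_off + height - 1) // tile_size
    return [
        (tile_row, tile_col)
        for tile_row in range(first_tile_row, last_tile_row + 1)
        for tile_col in range(first_tile_col, last_tile_col + 1)
    ]


# Hypothetical 300 x 300 pixel area of interest inside a 4096 x 4096 image
needed = tiles_for_window(col_off=1000, row_off=600, width=300, height=300)
total_tiles = (4096 // 512) ** 2
print(f"{len(needed)} of {total_tiles} tiles needed: {needed}")
```

In this made-up case only 2 of the 64 tiles overlap the window, so a reader that understands the tile layout can skip the rest of the file.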
Overviews¶
Overviews are zoomed-out, lower-resolution versions of the original image. They are produced by downsampling: the size of each cell in the grid is increased, so the image becomes smaller and carries less detail. A GeoTIFF file will often have multiple overviews, each at a different zoom level.
Overviews are really useful when a client wants to quickly render an image of the whole file. Instead of requesting the full-resolution image, the client can request one of the available overviews and obtain a preview far more efficiently.
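To show what downsampling does to the pixel grid, here is a small NumPy sketch that builds a set of overviews by averaging square blocks of pixels. The function name `build_overviews` and the factors 2, 4, and 8 are assumptions for illustration, and block averaging is only one of several possible resampling methods, so this is not necessarily how the overviews in any given COG were produced.

```python
import numpy as np


def build_overviews(image, factors=(2, 4, 8)):
    """Downsample a 2D array by averaging factor x factor blocks of pixels."""
    overviews = {}
    for factor in factors:
        # Trim the edges so the shape divides evenly by the factor
        rows = (image.shape[0] // factor) * factor
        cols = (image.shape[1] // factor) * factor
        blocks = image[:rows, :cols].reshape(
            rows // factor, factor, cols // factor, factor
        )
        # Each output cell is the mean of one block of input cells
        overviews[factor] = blocks.mean(axis=(1, 3))
    return overviews


# Hypothetical 8 x 8 single-band "image"
full_resolution = np.arange(64, dtype=float).reshape(8, 8)
for factor, overview in build_overviews(full_resolution).items():
    print(f"factor {factor}: shape {overview.shape}")
```

Each step up in factor shrinks the array, which is why a client can fetch a coarse preview with far less data than the full-resolution image.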
Visualizing Overviews Using Rasterio¶
Here we will visualize the overviews of the blue, green, and red bands of two sample COGs with an AOI centered around midtown Manhattan.
import rasterio
import os
import matplotlib.pyplot as plt
from rasterio.plot import show
import numpy as np
%matplotlib inline

def generate_overviews(image):
    with rasterio.open(image) as src:
        # Iterate over the blue, green, and red bands (every band except the last)
        for band in src.indexes[:-1]:
            # Get list of overview levels (decimation factors) for the band
            cog_overview = src.overviews(band)
            # Iterate over each level in reverse order
            for factor in reversed(cog_overview):
                # Decimated read: request the band at the reduced shape for this level
                overview_image = src.read(band, out_shape=(src.height // factor, src.width // factor))
                # Apply a boolean mask to convert all 0 values in the array to nan
                overview_image = overview_image.astype(float)
                overview_image[overview_image == 0] = np.nan
                # Plot each overview
                plt.figure(figsize=(20, 15))
                plt.xlabel("Columns", fontsize=20)
                plt.ylabel("Rows", fontsize=20)
                plt.title('Overview: Band: {} -- Zoom Level {} -- Modified height: {}, Modified Width: {}'.format(band, factor, overview_image.shape[0], overview_image.shape[1]), fontsize=20)
                plt.imshow(overview_image)
                plt.show()

# Can replace with any sample COG of your choosing
analytic_image = os.path.join(os.getcwd(), '20210514_145807_70_2455_3B_AnalyticMS.tif')
visual_image = os.path.join(os.getcwd(), '4478803_1857818_2021-05-14_227b_RGB_Visual.tif')
Let's take a look at the analytic image. For each band, we take a decimated read of the image at each overview level, or zoom factor (these often increase by factors of 2). We can see the image transition from a quite detailed view to a progressively coarser, less refined resolution. The visual effects become most noticeable at the highest overview level.
# generate_overviews plots each overview and returns nothing, so there is no result to assign
generate_overviews(analytic_image)