This lesson is in the early stages of development (Alpha version)

Introduction to Geospatial Raster and Vector Data with Python

Key Points

Introduction to Raster Data
  • Raster data is pixelated data where each pixel is associated with a specific location.

  • Raster data always has an extent and a resolution.

  • The extent is the geographical area covered by a raster.

  • The resolution is the area covered by each pixel of a raster.
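
The relationship between extent and resolution can be sketched with plain arithmetic. This is a minimal illustration, not taken from the lesson data: the origin, pixel size, and raster shape below are made-up values.

```python
# Hypothetical raster: the origin, pixel size, and shape are made-up values.
x_min, y_max = 500000.0, 4100000.0      # upper-left corner, in CRS units (e.g. metres)
pixel_width, pixel_height = 30.0, 30.0  # resolution: ground size of one pixel
n_rows, n_cols = 200, 300               # raster dimensions in pixels

# The extent follows from the origin, the resolution, and the shape.
x_max = x_min + n_cols * pixel_width
y_min = y_max - n_rows * pixel_height
extent = (x_min, y_min, x_max, y_max)

# Area covered by a single pixel, in square CRS units.
pixel_area = pixel_width * pixel_height

print("extent:", extent)
print("pixel area:", pixel_area)
```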

Introduction to Vector Data
  • Vector data structures represent specific features on the Earth’s surface along with attributes of those features.

  • Vector objects are either points, lines, or polygons.

Coordinate Reference Systems
  • All geospatial datasets (raster and vector) are associated with a specific coordinate reference system.

  • A coordinate reference system includes datum, projection, and additional parameters specific to the dataset.

The Geospatial Landscape
  • Many software packages exist for working with geospatial data.

  • Command-line programs allow you to automate and reproduce your work.

  • JupyterLab provides a user-friendly interface for working with Python.

Intro to Raster Data in Python
  • The GeoTIFF file format includes metadata about the raster data.

  • rioxarray stores CRS information as a CRS object that can be converted to an EPSG code or PROJ4 string.

  • The GeoTIFF file may or may not store the correct no data value(s).

  • We can find the correct value(s) in the raster’s external metadata or by plotting the raster.

  • rioxarray and xarray are to multidimensional arrays what pandas is to tabular data with many columns.
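
The effect of an unhandled no data value can be sketched without a file. This example uses a plain NumPy array as a stand-in for a raster band, and it assumes a sentinel of -9999, which is a common convention but not a universal one. The array values are made up.

```python
import numpy as np

# Made-up elevation values; -9999 is assumed to be the no data sentinel.
nodata = -9999
raster = np.array([[310.5, 312.0, nodata],
                   [nodata, 315.2, 316.8]], dtype=float)

# Without masking, the sentinel distorts summary statistics.
print("naive min:", raster.min())

# Replace the sentinel with NaN so NaN-aware functions ignore it.
masked = np.where(raster == nodata, np.nan, raster)
print("masked min:", np.nanmin(masked))
print("valid pixels:", int(np.count_nonzero(~np.isnan(masked))))
```

If the sentinel recorded in the file's metadata is wrong, the comparison against `nodata` would need the value found in the external metadata instead.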

Reproject Raster Data with Rioxarray
  • Two raster datasets must be in the same CRS before you can plot them together or do calculations with them.

  • rioxarray and xarray provide simple syntax for accomplishing fundamental geospatial operations.

  • rioxarray is built on top of rasterio, and you can use rasterio directly to accomplish fundamental geospatial operations.

Raster Calculations in Python
  • Python’s built-in math operators are fast and simple options for raster math.

  • numpy.digitize can be used to classify raster values in order to generate a less complicated map.

  • DataArrays can be created from scratch from numpy arrays as well as read in from existing files.
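
A minimal sketch of raster math and classification on a plain NumPy array. The height values and class breaks are made up for illustration, and the unit conversion is only an example of an element-wise operator.

```python
import numpy as np

# Made-up canopy heights in metres; the class breaks are illustrative only.
heights = np.array([[0.5, 3.0, 12.0],
                    [7.5, 25.0, 1.9]])
bins = [2, 10, 20]  # class edges: <2, 2 to <10, 10 to <20, >=20

# digitize returns, for each value, the index of the bin it falls in.
classes = np.digitize(heights, bins)
print(classes)

# Built-in operators work element-wise, e.g. converting metres to feet.
heights_ft = heights * 3.28084
print(heights_ft.round(1))
```

With the default `right=False`, a value equal to a bin edge is assigned to the higher class.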

Open and Plot Shapefiles in Python
  • Shapefile metadata include geometry type, CRS, and extent.

  • Load spatial objects into Python with the geopandas.read_file() function.

  • Spatial objects can be plotted directly with geopandas.GeoDataFrame.plot().

Plot Multiple Shapefiles with Geopandas FIXME
  • Use a shared matplotlib Axes object (passed to each plot call through the ax argument) to add multiple layers to a plot.

  • Multi-layered plots can combine raster and vector datasets.
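
Layering can be sketched by creating one matplotlib Axes and passing it to each plot call through the `ax` argument. The geometries below are made up, and both layers are assumed to share a CRS. The `Agg` backend is selected only so the sketch runs without a display.

```python
import geopandas as gpd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
from shapely.geometry import Point, Polygon

# Made-up layers sharing one CRS.
boundary = gpd.GeoDataFrame(
    geometry=[Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])], crs="EPSG:32618"
)
towers = gpd.GeoDataFrame(geometry=[Point(3, 4), Point(7, 8)], crs="EPSG:32618")

# Create one Axes and pass it to each plot call to stack the layers.
fig, ax = plt.subplots()
boundary.plot(ax=ax, facecolor="none", edgecolor="black")
towers.plot(ax=ax, color="red", markersize=20)
ax.set_title("Boundary and towers")
```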

Convert from .csv to a Shapefile in Python
  • Know the projection (if any) of your point data prior to converting to a spatial object.

  • This projection information can be used to convert a text file with spatial columns into a shapefile (or GeoJSON) with geopandas.
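
A minimal sketch of turning tabular coordinates into a spatial object. The CSV content and column names are hypothetical, and the coordinates are assumed to be WGS 84 longitude/latitude (EPSG:4326); real data may use a different CRS.

```python
import io

import geopandas as gpd
import pandas as pd

# Hypothetical CSV with longitude/latitude columns (values are made up).
csv_text = """site,lon,lat
A,-72.17,42.54
B,-72.18,42.53
"""
df = pd.read_csv(io.StringIO(csv_text))

# Assumption: the coordinates are WGS 84 longitude/latitude (EPSG:4326).
gdf = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df["lon"], df["lat"]),
    crs="EPSG:4326",
)
print(gdf.crs.to_epsg())
print(gdf.geometry.geom_type.unique())
```

Calling `gdf.to_file()` with an output path would then write the points to a shapefile or GeoJSON file.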

Calculating Zonal Statistics on Rasters
Intro to Raster Data in Python FIXME
  • Two vector datasets must be in the same CRS before you can plot them together.

  • Use the GeoDataFrame.to_crs() method to convert between CRSs.
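
Reprojection with `to_crs()` can be sketched with two small in-memory layers. The layer names, coordinates, and EPSG codes here are made up for illustration.

```python
import geopandas as gpd
from shapely.geometry import Point

# Two made-up layers in different CRSs.
points_wgs84 = gpd.GeoDataFrame(
    {"name": ["tower"]}, geometry=[Point(-72.17, 42.54)], crs="EPSG:4326"
)
plots_utm = gpd.GeoDataFrame(
    {"name": ["plot"]}, geometry=[Point(732000.0, 4713000.0)], crs="EPSG:32618"
)

# Reproject one layer to match the other before plotting them together.
points_utm = points_wgs84.to_crs(plots_utm.crs)
print(points_utm.crs == plots_utm.crs)
```

`to_crs()` returns a new GeoDataFrame and leaves the original layer unchanged.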

Manipulate Raster Data in Python FIXME
Work With Multi-Band Rasters in Python
  • A single raster file can contain multiple bands or layers.

  • Individual bands within a DataArray can be accessed, analyzed, and visualized using the same plot function as single-band rasters.
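
Band access can be sketched with a plain NumPy array standing in for a DataArray. The values are made up, and the array uses the (band, y, x) layout that rioxarray produces when it opens a raster.

```python
import numpy as np

# Made-up 3-band image with shape (band, y, x).
rgb = np.arange(3 * 2 * 2, dtype=float).reshape(3, 2, 2)

# Select a single band by its position along the first axis.
red = rgb[0]
print(red.shape)

# Summarize each band separately by reducing over the y and x axes.
band_means = rgb.mean(axis=(1, 2))
print(band_means)
```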

Derive Values from Raster Time Series FIXME
Raster Time Series Data in Python FIXME
Explore and Plot by Shapefile Attributes
  • A GeoDataFrame in geopandas is similar to a standard pandas DataFrame and can be manipulated using the same functions.

  • Almost any feature of a plot can be customized using the various functions and options in the matplotlib package.
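
A minimal sketch of pandas-style operations on a GeoDataFrame. The line features and the `TYPE` attribute column are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import LineString

# Made-up line features with a "TYPE" attribute.
lines = gpd.GeoDataFrame(
    {"TYPE": ["footpath", "road", "footpath"]},
    geometry=[
        LineString([(0, 0), (1, 1)]),
        LineString([(0, 1), (2, 1)]),
        LineString([(1, 0), (1, 3)]),
    ],
    crs="EPSG:32618",
)

# Boolean indexing and value_counts work as they do in pandas.
footpaths = lines[lines["TYPE"] == "footpath"]
print(lines["TYPE"].value_counts())
print(type(footpaths).__name__)
```

The filtered result is still a GeoDataFrame, so it keeps its geometry column and CRS and can be plotted directly.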