This lesson is in the early stages of development (Alpha version)

Introduction to Geospatial Raster and Vector Data with Python

Key Points

Introduction to Raster Data
  • Raster data is pixelated data where each pixel is associated with a specific location.

  • Raster data always has an extent and a resolution.

  • The extent is the geographical area covered by a raster.

  • The resolution is the area covered by each pixel of a raster.
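
As a rough illustration of how extent and resolution relate, the sketch below derives the number of rows and columns of a raster from its bounding box and pixel size. The function name `raster_shape` and all coordinate values are made up for this example; they do not come from the lesson data.

```python
def raster_shape(xmin, ymin, xmax, ymax, pixel_size):
    """Return (rows, cols) for a raster covering the given extent."""
    width = xmax - xmin    # extent along x, in CRS units (e.g. metres)
    height = ymax - ymin   # extent along y, in the same units
    cols = round(width / pixel_size)
    rows = round(height / pixel_size)
    return rows, cols


# Hypothetical extent: a 1000 m x 500 m area sampled with 10 m pixels.
rows, cols = raster_shape(0.0, 0.0, 1000.0, 500.0, pixel_size=10.0)

# With square 10 m pixels, each pixel covers 100 square metres on the ground.
pixel_area = 10.0 * 10.0
```

Halving the pixel size over the same extent would quadruple the number of pixels, which is why finer resolution quickly increases file size.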

Introduction to Vector Data
  • Vector data structures represent specific features on the Earth’s surface along with attributes of those features.

  • Vector objects are either points, lines, or polygons.
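
To make the three geometry types concrete, here is a minimal sketch that stores them as plain coordinate lists and attaches attributes to one feature. This is a stand-in for illustration only: real workflows would normally use a geometry library rather than raw tuples, and the helper names `line_length` and `polygon_area`, the `feature` layout, and all values are hypothetical.

```python
import math

# A point is a single (x, y) pair.
point = (2.0, 3.0)

# A line is an ordered sequence of at least two vertices.
line = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]

# A polygon is a closed ring: the first and last vertices coincide.
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]

# Attributes describe the feature that the geometry represents.
feature = {"geometry": polygon, "attributes": {"name": "plot A", "land_use": "forest"}}


def line_length(vertices):
    """Sum the straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_area(ring):
    """Planar area of a closed ring, computed with the shoelace formula."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2
```

Both helpers assume planar coordinates in a projected CRS; they would not give meaningful lengths or areas for longitude and latitude in degrees.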

Coordinate Reference Systems
  • All geospatial datasets (raster and vector) are associated with a specific coordinate reference system.

  • A coordinate reference system includes a datum, a projection, and additional parameters specific to the dataset.

The Geospatial Landscape
  • Many software packages exist for working with geospatial data.

  • Command-line programs allow you to automate and reproduce your work.

  • JupyterLab provides a user-friendly interface for working with Python.

Intro to Raster Data in Python
  • The GeoTIFF file format includes metadata about the raster data.

  • rioxarray stores CRS information as a CRS object that can be converted to an EPSG code or PROJ4 string.

  • The GeoTIFF file may or may not store the correct no data value(s).

  • We can find the correct value(s) in the raster’s external metadata or by plotting the raster.

  • rioxarray and xarray are to multidimensional arrays what pandas is to tabular data with many columns.
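
The effect of a no data value on statistics can be sketched with plain NumPy. The array below is a made-up stand-in for a raster band, and the sentinel -9999 is an assumption; a real file may use a different value or none at all. rioxarray can apply a stored no data value when a file is opened (for instance with `masked=True` in `rioxarray.open_rasterio`), but the sketch avoids file access so that it stays self-contained.

```python
import numpy as np

# Hypothetical elevation band in which -9999 marks missing pixels.
NODATA = -9999.0
elevation = np.array([[120.0, 125.0, NODATA],
                      [118.0, NODATA, 130.0]])

# Replace the sentinel with NaN so it no longer distorts statistics.
masked = np.where(elevation == NODATA, np.nan, elevation)

naive_mean = elevation.mean()      # skewed by the sentinel values
masked_mean = np.nanmean(masked)   # ignores the missing pixels
```

A strongly negative mean for elevation data is the kind of symptom that suggests a no data value has not been masked.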

Reproject Raster Data with Rioxarray
  • To plot or do calculations with two raster datasets, they must be in the same CRS.

  • rioxarray and xarray provide simple syntax for accomplishing fundamental geospatial operations.

  • rioxarray is built on top of rasterio, and you can use rasterio directly to accomplish fundamental geospatial operations.

Raster Calculations in Python
  • Python’s built-in math operators are fast and simple options for raster math.

  • numpy.digitize can be used to classify raster values in order to generate a less complicated map.

  • DataArrays can be created from scratch from numpy arrays as well as read in from existing files.
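
The raster math and classification points above can be sketched on a small NumPy array. The canopy-height values and the class boundaries (2 m and 10 m) are invented for illustration; the same operators also work element by element on xarray DataArrays.

```python
import numpy as np

# Hypothetical canopy heights in metres (made-up numbers).
canopy_height = np.array([[0.0, 1.5, 4.0],
                          [7.2, 10.0, 12.5],
                          [3.3, 0.5, 25.0]])

# Ordinary arithmetic operators act on every pixel at once.
height_in_feet = canopy_height * 3.28084

# Assumed class boundaries: below 2 m, 2 m up to 10 m, 10 m and above.
class_bins = [2, 10]

# np.digitize returns, for each pixel, the index of the bin it falls into.
classes = np.digitize(canopy_height, class_bins)
```

Here `classes` holds 0, 1, or 2 for each pixel, so a map of it needs only three colors instead of a continuous color ramp.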

Work With Multi-Band Rasters in Python FIXME
  • A single raster file can contain multiple bands or layers.

  • Individual bands within a DataArray can be accessed, analyzed, and visualized using the same plot function as single bands.
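
Band access can be sketched with a NumPy array laid out as (band, y, x), the dimension order that rioxarray uses when it reads a multi-band file. The pixel values are invented, and the array is only a stand-in: on a real DataArray a band would typically be selected by label, for example with `.sel(band=1)`.

```python
import numpy as np

# Hypothetical 3-band raster, 2 x 2 pixels, laid out as (band, y, x).
rgb = np.array([
    [[10, 20], [30, 40]],      # band 1 (red)
    [[50, 60], [70, 80]],      # band 2 (green)
    [[90, 100], [110, 120]],   # band 3 (blue)
])

# Positional index 0 holds the first band.
red = rgb[0]

# Average over the two spatial axes to get one mean per band.
band_means = rgb.mean(axis=(1, 2))
```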

Open and Plot Shapefiles in Python
  • Shapefile metadata include geometry type, CRS, and extent.

  • Load spatial objects into Python with the geopandas.read_file() function.

  • Spatial objects can be plotted directly with geopandas.GeoDataFrame.plot().

Explore and Plot by Shapefile Attributes
  • A GeoDataFrame in geopandas is similar to a standard pandas DataFrame and can be manipulated using the same functions.

  • Almost any feature of a plot can be customized using the various functions and options in the matplotlib package.

Plot Multiple Shapefiles with Geopandas FIXME
  • Use a matplotlib Axes object (matplotlib.axes.Axes) to add multiple layers to a plot.

  • Multi-layered plots can combine raster and vector datasets.

Convert from .csv to a Shapefile in Python FIXME
  • Know the projection (if any) of your point data prior to converting to a spatial object.

Intro to Raster Data in Python FIXME
  • To plot two vector datasets together, they must be in the same CRS.

  • Use the GeoDataFrame.to_crs() method to convert between CRSs.

Manipulate Raster Data in Python FIXME
Raster Time Series Data in Python FIXME
Derive Values from Raster Time Series FIXME
Create Publication-quality Graphics FIXME

Glossary

FIXME