Open and Plot Vector Layers

Last updated on 2024-09-10 | Edit this page

Overview

Questions

  • How can I read, examine and visualize point, line and polygon vector data in R?

Objectives

After completing this episode, participants should be able to…

  • Differentiate between point, line, and polygon vector data.
  • Load vector data into R.
  • Access the attributes of a vector object in R.

Prerequisite

In this lesson you will work with the sf package. Note that the sf package has some external dependencies, namely GEOS, PROJ.4, GDAL and UDUNITS, which need to be installed beforehand. Before starting the lesson, follow the workshop setup instructions for the installation of sf and its dependencies.

First we need to load the packages we will use in this lesson. We will use the tidyverse package with which you are already familiar from the previous lesson. In addition, we need to load the sf package for working with spatial vector data.

R

library(tidyverse)  # wrangle, reshape and visualize data
library(sf)         # work with spatial vector data

The sf package

sf stands for Simple Features which is a standard defined by the Open Geospatial Consortium for storing and accessing geospatial vector data. PostGIS, for instance, uses the same standard, so PostGIS users might recognise the functions used by the sf package.

Import shapefiles


Let’s start by opening a shapefile. Shapefiles are a common file format to store spatial vector data used in GIS software. Note that a shapefile consists of multiple files and it is important to keep them all together in the same location. We will read a shapefile with the administrative boundary of Delft with the function st_read() from the sf package.

R

boundary_Delft <- st_read("data/delft-boundary.shp", quiet = TRUE)

All sf functions start with st_

Note that all functions from the sf package start with the standard prefix st_ which stands for Spatial Type. This is helpful in at least two ways:

  1. it allows for easy autocompletion of function names in RStudio, and
  2. it makes the interaction with or translation to/from software using the simple features standard like PostGIS easy.

Spatial Metadata


By default (with quiet = FALSE), the st_read() function provides a message with a summary of metadata about the file that was read in. To examine the metadata in more detail, we can use other, more specialised, functions from the sf package.

The st_geometry_type() function, for instance, gives us information about the geometry type, which in this case is POLYGON.

R

st_geometry_type(boundary_Delft)

OUTPUT

[1] POLYGON
18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE

The st_crs() function returns the coordinate reference system (CRS) used by the shapefile, which in this case is WGS 84 and has the unique reference code EPSG: 4326.

R

st_crs(boundary_Delft)

OUTPUT

Coordinate Reference System:
  User input: WGS 84
  wkt:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["latitude",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["longitude",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]

Examining the output of st_crs()

As the output of st_crs() can be long, you can use $Name and $epsg after the crs() call to extract the projection name and EPSG code respectively.

The st_bbox() function shows the extent of the layer. As WGS 84 is a geographic CRS, the extent of the shapefile is displayed in degrees.

R

st_bbox(boundary_Delft)

OUTPUT

     xmin      ymin      xmax      ymax
 4.320218 51.966316  4.407911 52.032599 

We need a projected CRS, which in the case of the Netherlands is typically the Amersfort / RD New projection. To reproject our shapefile, we will use the st_transform() function. For the crs argument we can use the EPSG code of the CRS we want to use, which is 28992 for the Amersfort / RD New projection. To check the EPSG code of any CRS, we can check this website: https://epsg.io/

R

boundary_Delft <- st_transform(boundary_Delft, 28992)
st_crs(boundary_Delft)$Name

OUTPUT

[1] "Amersfoort / RD New"

R

st_crs(boundary_Delft)$epsg

OUTPUT

[1] 28992

Notice that the bounding box is measured in meters after the transformation.

R

st_bbox(boundary_Delft)

OUTPUT

     xmin      ymin      xmax      ymax
 81743.00 442446.21  87703.78 449847.95 

R

st_crs(boundary_Delft)$units_gdal

OUTPUT

[1] "metre"

We confirm the transformation by examining the reprojected shapefile.

R

boundary_Delft

OUTPUT

Simple feature collection with 1 feature and 1 field
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 81743 ymin: 442446.2 xmax: 87703.78 ymax: 449848
Projected CRS: Amersfoort / RD New
  osm_id                       geometry
1 324269 POLYGON ((87703.78 442651, ...

Callout

Read more about Coordinate Reference Systems here. We will also practice transformation between CRS in Handling Spatial Projection & CRS.

Plot a vector layer


Now, let’s plot this shapefile. You are already familiar with the ggplot2 package from Introduction to Visualisation. ggplot2 has special geom_ functions for spatial data. We will use the geom_sf() function for sf data.

R

ggplot(data = boundary_Delft) +
  geom_sf(size = 3, color = "black", fill = "cyan1") +
  labs(title = "Delft Administrative Boundary") +
  coord_sf(datum = st_crs(28992)) # displays the axes in meters

Challenge: Import line and point vector layers

Read in delft-streets.shp and delft-leisure.shp and assign them to lines_Delft and points_Delft respectively. Answer the following questions:

  1. What type of R spatial object is created when you import each layer?
  2. What is the CRS and extent for each object?
  3. Do the files contain points, lines, or polygons?
  4. How many features are in each file?

R

lines_Delft <- st_read("data/delft-streets.shp")
points_Delft <- st_read("data/delft-leisure.shp")

We can check the type of type of geometry with the st_geometry_type() function. lines_Delft contains "LINESTRING" geometry and points_Delft is made of "POINT" geometries.

R

st_geometry_type(lines_Delft)[1]

OUTPUT

[1] LINESTRING
18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE

R

st_geometry_type(points_Delft)[2]

OUTPUT

[1] POINT
18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE

lines_Delft and points_Delft are in the correct CRS.

R

st_crs(lines_Delft)$epsg

OUTPUT

[1] 28992

R

st_crs(points_Delft)$epsg

OUTPUT

[1] 28992

When looking at the bounding boxes with the st_bbox() function, we see the spatial extent of the two objects in a projected CRS using meters as units. lines_Delft() and points_Delft have similar extents.

R

st_bbox(lines_Delft)

OUTPUT

     xmin      ymin      xmax      ymax
 81759.58 441223.13  89081.41 449845.81 

R

st_bbox(points_Delft)

OUTPUT

     xmin      ymin      xmax      ymax
 81863.21 442621.15  87370.15 449345.08 

Key Points

  • Metadata for vector layers include geometry type, CRS, and extent.
  • Load spatial objects into R with the st_read() function.
  • Spatial objects can be plotted directly with ggplot2 using the geom_sf() function. No need to convert to a data frame.