Organizing Camera Trap Data
Overview
Teaching: 60 min
Exercises: 30 minQuestions
How can we use program R and package camTrapR to organize camera trap data?
Objectives
Perform camera trap organizational steps like renaming files
Extract exif data from camera traps
Combine dataframes with locational information
In camera trapping studies, it’s common to have a lot of camera photos, sometimes thousands or even millions of images that need to be processed and formatted for analysis tasks. Camera trap data organization requires careful management and data extraction that can be greatly assisted by the use of programming tools, like program R and Python.
In general, if we want to organize our data from raw camera trapping data, there will also be other files including GPS locations, and camera operations data including start-times, end times and any issues encountered that may have interrupted camera data capture such as repairs.
We will begin this module by first organizing and formatting our camera trap data that have already been sorted for the snow leopard species. Our goal is to sort these images by individual, and so our first steps will be processing and organizing the raw camera trap information.
First, set the working directory for the workshop where the snow leopard data have been downloaded, which is essentially a folder of your camera trap imagery.
#Set working directory
setwd("YourWorkingDirectory/CarpentriesforCameraTraps/")
#Make sure to have ExifTool installed and bring in the camtrapR package
library(camtrapR)
Set the file path of your image directory
# raw image location
wd_images_raw <- file.path("2012_CTdata")
One of the first steps that we want to perform is a data quality check. Make sure that each of the file folders has the name of the camera trap station, often including the name and number. In our case we have cameras with names like “C1_Avgarch_SL”, which indicates that this camera station was numbered 1, and at a location named Avgarch. These names are consistent across data tables such as the GPS coordinates and camera operations information, making it easier to merge this information.
Since SD cards often name files sequentially like “IMG_1245.jpg”, then there may be more than one file with this name within the multitude of camera trap photos we have captured. Our goal is to give each image a unique filename that uses the location, date/time, and a sequential number, so that the photo filenames are unique. To do this, the folder names have to have the location information because the new image names will have the camera trap location as an identifier.
To create a new directory for our copied data we can use the dir.create function.
In our case, we will name the new folder 2012CameraData_renamed
#create directory
dir.create("2012CameraData_renamed")
Then, set the file.path to an object, which eventually will be used for adding the renamed images.
#get the file path for the new directory and store in an object
wd_images_raw_renamed <- file.path("2012CameraData_renamed")
A quick fix before we rename. Some camera trap models (like Reconyx) do not use the standard Exif metadata information for the date and time, which makes it not possible to read directly, so we use the fixDateTimeOriginal function in the camTrapR package.
#fix date time objects that may not be in standard Exif format
fixDateTimeOriginal(wd_images_raw,recursive = TRUE)
Renaming camera trap files is possible using the imageRename function. Here we specify the input and output directories.
There are additional parameters for whether the directories contain multiple cameras at the station, like an A and B station opposing each other (hasCameraFolders). In our case, our folders do have subdirectories, but they are not specific to a substation, so we will set this to false.
Additional parameters include whether the camera subdirectories should be kept (keepCameraSubfolders), and we do not have extra station or species subdirectories we can also keep this as FALSE. We will set copyImages to TRUE because we want these images to go into a new directory.
#rename images
renaming.table2 <- imageRename(inDir = wd_images_raw,
outDir = wd_images_raw_renamed,
hasCameraFolders = FALSE,
keepCameraSubfolders = FALSE,
copyImages = TRUE)
Next, we will create a record table or dataframe of the exif information, that includes station, species, date/time, and directory information.
If you followed the setup instructions carefully, then you should have no issues with ExifTool which has the backbone tools for extracting exifs. However, you may need to install it. https://exiftool.org/
First you can check if you have the tool installed:
Sys.which("exiftool")
If not, then you can configure it using the path information for where the tool was placed on your computer. It does not need installation, but it does need to be found on your computer.
# this is a dummy example assuming the directory structure: C:/Path/To/Exiftool/exiftool.exe
exiftool_dir <- "C:/Path/To/Exiftool"
exiftoolPath(exiftoolDir = exiftool_dir)
Moving onto our exif extraction. There are parameters to allow the extracted exif dataframe to include available species information, to be sorted from the directory, for example your data may be sorted with this structure: (Station/Species) or (Station/Camera/Species). In our case, we only have one species, snow leopard images, so we will not use these extra settings, although the parameters are available if your data has species folders.
#create a dataframe from exif data that includes date and time
rec.db.species0 <- recordTable(inDir = wd_images_raw_renamed,
IDfrom = "directory")
After inspecting the dataframe, we can see there is a Species column with the wrong information in it, so let’s tell the dataframe which species and genus we are working with.
#change the species column contents
rec.db.species0$Species <- "uncia"
rec.db.species0$Genus <- "Pathera"
To save this table to a csv file we can write this to file, so we have the raw exif data if we need it.
#write the Exif data to file
write.csv(rec.db.species0, "CameraTrapExifData.csv")
Now we have the exif data finished and in a dataframe format.
Next we are going to bring in the data from the GPS coordinates. By loading the Metadata with the GPS coordinates and camera function dataframe into the program.
#load the camera trap GPS and camera function information
WakhanData<-read.csv("Metadata_CT_2012.csv")
We can check out the syntax of the geometry column of our metadata
#look at the synatx of the geometry column for GPS coordinates
WakhanData$Loc_geo[1]
When we inspect these data, two empty rows have no information, so we’ll have to clean this up a bit. There are several ways of doing this, for one, we can use the complete.cases function. This will remove any rows with NA values anywhere in the matrix.
If your data are complete, this is fine. If they are not then this will subset your dataframe. For our purposes this is fine.
#remove the two rows with missing data
WakhanData<-WakhanData[complete.cases(WakhanData),]
Another factor can see that the location column with the coordinates has a format with the coordinates in one string. So, to fix this we need to do a bit of work to get it into the format that we want. To do this we can use the substr function to get a substring of the data out of the string. Since they are all the same format we can simply pull out the numbers that we want using the place of the characters in the string. For example, to get the lattitude coordinates, we need to pull out the 5th to 11th characters in the string.
# double check the substring of the UTM coordinates to extract
substr(WakhanData$Loc_geo, 5,11)
We can assign these new strings to new columns in our dataframe.
#add the Easting and Northing data to new X and Y columns
WakhanData$X<-substr(WakhanData$Loc_geo, 5,11)
WakhanData$Y<-substr(WakhanData$Loc_geo, 13,nchar(WakhanData$Loc_geo[1]))
Great so we have our Latitude and Longitude coordinates. Let’s now we want to merge the dataframe with the exif data and the dataframe with the GPS coordinates and camera infromation together. Before we can do that, we need to make sure that there is a column in both that match completely. So let’s have a check and see if the trap names in the record table are the same in the GPS coordinates table.
#add the check the camera trap station names between the two dataframes
unique(rec.db.species0$Station)
unique(WakhanData$Trap.site)
From this result we can see nearly all of the camera traps are different because there is an extra _SL at the end of the names, so we can remove it. We can use the stringr package and function str_remove to apply a removal.
#remove characters "_SL" from the record table station names
library(stringr)
rec.db.species0$Station<-str_remove(rec.db.species0$Station, "_SL")
We can use the setdiff function to determine if any of the trap names are still different. Oftentimes, there are misspellings.
#check if the site names are the same
setdiff(unique(rec.db.species0$Station), unique(WakhanData$Trap.site))
We find two cameras have different names. To fix this, we can use the str_replace function in the stringr package
#replace bad station names with correct spellings
WakhanData$Trap.site<-str_replace(WakhanData$Trap.site,"C5_Ishmorg" , "C5_Ishmorgh")
WakhanData$Trap.site<-str_replace(WakhanData$Trap.site,"C18_Khandud" , "C18_Khundud")
setdiff(unique(rec.db.species0$Station), unique(WakhanData$Trap.site))
Now, there is one more problem with our dataset, and that is that our coordinates are only in UTM coordinate system, and we actually need them in a Lat/Long coordinate system to upload them to the Whiskerbook format.
To work with the spatial formats and convert these coordinates, we will use the sf package. We can first create a few objects of the coordinate systems that we will be using.
#load sf package and set coordinate systems to objects
library(sf)
#The coordinate information for Lat/Long is EPSG:4326
wgs84_crs = "+init=EPSG:4326"
#The coordinate information for UTM is EPSG:32643
UTM_crs = "+init=EPSG:32643"
Next, we want to create a shapefile of points of our GPS coordinates that is in the UTM coordinate system.
#convert the GPS coordinates into shapefile points
WakhanData_points<-st_as_sf(WakhanData, coords=c("X","Y"), crs=UTM_crs)
Lets plot them to make sure we did it right and that we did not confuse our X and Y coordinates.
plot(WakhanData_points[,"Year"])
Here we will convert the coordinate system to WGS84 using the st_transform function, which is our handy function for transforming coordinate systems.
#convert the coordinate system from UTM to lat long WGS84
WakhanData_points_latlong<-st_transform(WakhanData_points, crs=wgs84_crs)
Now, we will extract the coordinates from the new transformed points, and put them into an dataframe object named WakhanData_points_latlong_df
WakhanData_points_latlong_df<- st_coordinates(WakhanData_points_latlong)
We will rename the columns of the new dataframe object, Lat and Long.
colnames(WakhanData_points_latlong_df)<-c("Lat","Long")
Then, we can simply add these columns back to the original dataframe.
WakhanData <-cbind(WakhanData, WakhanData_points_latlong_df)
Now that we have the coordinates in the format that we want, then we can merge the two dataframes (for the exif data and the metadata) together using the merge function. We can set the columns we want to match on using the by.x and by.y arguments, and in our case, the column name in the exif data for “Station” is the same as the column name in the metadata for “Trap.site”. They have the same location names in these two columns that will match exactly. We are simply telling the program what those station names are.
Then set the all=FALSE because some of the records in the datatable with the GPS coordinates, we do not have camera data for so we do not need them in the final dataframe. The all argument can be set to true to include all records in both tables, but in this case we only want to merge the data from the first table that matches the second table.
#merge the record table to the GPS coordinates
final_CameraRecords<-merge(rec.db.species0, WakhanData, by.x="Station", by.y="Trap.site", all=TRUE)
Now let’s save this file for later.
#write the file to csv
write.csv(final_CameraRecords, "SnowLeopard_CameraTrap.csv")
Challenge: Renaming Files and changing CRS
Answer the following questions:
What would we do for renaming our files if we had A and B camera stations or species names subfolders? What would our code look like
What if our data were in the Namdapha National Park? What CRS would we use, and how would we code this in R?
Answers
The imageRename function in the camtrapR function allows for subdirectory folders to be organized separately. See the help for imageRename function ?imageRename to find out the station directories can have subdirectories “inDir/StationA/Camera1” to organize two cameras per station.
The Namdapha National Park is in Northeastern India, and is WGS 1984 UTM Zone 4N the EPSG code is 32604 “+init=EPSG:32643”
Key Points
Load camera trap data into R with the camtrapR package
Rename photos according to trap location and date, then copy to a new folder
When character strings between two dataframes do not match the str_replace() function can replace or change parts of the strings for a column in a dataframe
Spatial objects can be projected using the st_transform() function