RGIS 3.0
Arthur de Grandpré et Lisane Arsenault-Boucher
28/01/2021
- Context
- Version 3.0
- R as a GIS and mapping software
- Exercices
- Step 1. Prepare the R environment and install/load the libraries
- Step 2. Reading and visualising point data + adding a basemap
- Step 3. Using leaflet to create interactive maps
- Step 4. Loading and plotting vectors and rasters
- Step 5. Extracting data from rasters
- Step 6. Sharing your awesome map
- Step 7. Sharing spatial data
- Extras
Context
Most would agree that data science should always start and end with data visualization. It allows scientists to understand the data structure and distribution, but also to connect with the public, by creating easily understandable content that facilitates communication.
In the previous workshops, we have learned how to:
- Efficiently visualize data and results
- Manipulate data to facilitate its analysis
- Generate data efficiently in order to replicate real life trends
… and all of this makes it hard to leave the R environment! Most of us still are leaving the comfort of R whenever we are faced with spatial data, splitting our workflows and creating different scripts, different processes and ultimately resulting in a loss of time and efficiency.
If you’re willing to dig deep enough, you’ll find out that R can be used as part of a high performance GIS workflow for many applications, especially when combined with some external programs such as QGIS, GRASS and GDAL.
The main goal of this workshop is to break the mindset in which R is bad at dealing with spatial data or spatial analysis.
It is important to keep in mind that this is only an introduction and that it is possible to do almost anything with R, and in many different ways.
Also, it might be even more important to keep in mind that in many situations, while R could do it, you might want to consider other solutions. Using R for everything is fun and can be practical, but sometimes it’s just more tedious than necessary.
Version 3.0
Previous versions of this workshop were mostly focused on understanding basic data structure for spatial objects from the “sp” package, and how they can be used for basic mapping and building interactive maps using leaflet.
This version will shift towards a more up-do-date approach to spatial data in R using the “sf” package instead of “sp”. sf is faster, integrates more functions, and is compatible with tidy writing. This workshop will also put more focus on producing actually publishable static maps, presenting interactive maps as a collaboration tool.
R as a GIS and mapping software
Multiple libraries have been made available in R in order to deal with spatial data, its visualization, its manipulation and its analysis.
The most notable are probably the following:
GIS
sp : allow the creation and manipulation of spatial objects (spatialpointsdataframes) + apply CRS and transformations sf : alternative environment to sp, integrating more functions (rgeos and rgdal) and compatible with tidy writing rgdal : An interface to access gdal, a spatial data processing library
rgeos : An interface to access geos, a spatial vector data processing library raster : Allows to read and manipulate raster data. Very important!
mapping
ggmap : extends ggplot2 mapping functionalities and is able to retrieve basemaps
leaflet : allows to build interactive maps, Google map style, easy to share
ggsn : add North arrow and scalebar to ggmap
Exercices
NOTE: All data for this exercise are located in the repository under the “Data” sub-directory, available here: https://github.com/arthurdegrandpre/NumeriLab_RGIS/tree/2021
Step 1. Prepare the R environment and install/load the libraries
install.packages("sp")
install.packages("sf")
install.packages("tidyverse")
install.packages("ggmap")
install.packages("leaflet")
install.packages("mapview")
install.packages("raster")
install.packages("leafsync")
install.packages("htmlwidgets")
install.packages("ggsn")
install.packages("rgdal", type = "source") # requires an installation of Rtools, see https://cran.r-project.org/bin/windows/Rtools/
webshot::install_phantomjs() # install only once
Step 2. Reading and visualising point data + adding a basemap
Multiples solutions are available to read point data in the R environment depending on your database format. The most common data format is text, such as .csv files. When working with .csv files, you will often have two columns referring to the coordinates of your data, which can be used create a spatial object with the packages sf or sp.
If you have a shapefile, it can be imported directly as a spatial object by using the function st_read from sf or readOGR() from the rgdal package. The later will generate an sp compatible object.
df = read.csv("../Data/csv/data_picom_HT.csv",sep=";") # contains forestry data for the UQTR campus, with lon/lat coordinates
head(df)
## zone parcelle zone_eco lat lon sup_zone_ha sup_parc_ha
## 1 18 18A jeune pinede blanche 46.34558 -72.58600 0.213 0.04
## 2 4 4A jeune pinede blanche 46.34992 -72.57578 0.552 0.04
## 3 13 13A mixte 46.34743 -72.58229 1.441 0.04
## 4 17 17A mixte 46.34656 -72.58300 2.789 0.04
## 5 17 17B mixte 46.34653 -72.58397 2.782 0.04
## 6 17 17C mixte 46.34597 -72.58511 2.782 0.04
## tot_drymass aerial_drymass root_drymass n_trees total_C n_living_trees
## 1 8861 6877 1984 95 4735 84
## 2 11379 8896 2483 50 6236 50
## 3 6035 4692 1343 51 3146 38
## 4 13721 10766 2955 57 7651 42
## 5 9865 7726 2140 40 5397 29
## 6 10449 8200 2250 39 5727 34
## mean_dhp max_dhp mean_height dom_sp est_age
## 1 14.9 29.0 14.1 PIB 45
## 2 23.5 33.9 13.6 PIB 45
## 3 16.6 28.9 13.6 BOP 45
## 4 22.2 79.1 16.9 PIB/PIG 45
## 5 23.5 42.4 18.3 PIB 45
## 6 22.8 49.2 15.1 PIB 45
All functions from sf are called with the prefix “st_”. “st_as_sf()” can be used to transform a foreign object into a sf object, followed by “st_set_crs()” to define coordinate system. The data is in NAD83, which corresponds to epsg code 4269.
## Simple feature collection with 36 features and 16 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -72.58601 ymin: 46.34263 xmax: -72.57278 ymax: 46.35089
## Geodetic CRS: NAD83
## First 10 features:
## zone parcelle zone_eco sup_zone_ha sup_parc_ha tot_drymass
## 1 18 18A jeune pinede blanche 0.213 0.04 8861
## 2 4 4A jeune pinede blanche 0.552 0.04 11379
## 3 13 13A mixte 1.441 0.04 6035
## 4 17 17A mixte 2.789 0.04 13721
## 5 17 17B mixte 2.782 0.04 9865
## 6 17 17C mixte 2.782 0.04 10449
## 7 17 17D mixte 2.782 0.04 6449
## 8 20 20A mixte suranne 1.761 0.04 13441
## 9 20 20B mixte suranne 1.761 0.04 20148
## 10 20 20C mixte suranne 1.761 0.04 13799
## aerial_drymass root_drymass n_trees total_C n_living_trees mean_dhp max_dhp
## 1 6877 1984 95 4735 84 14.9 29.0
## 2 8896 2483 50 6236 50 23.5 33.9
## 3 4692 1343 51 3146 38 16.6 28.9
## 4 10766 2955 57 7651 42 22.2 79.1
## 5 7726 2140 40 5397 29 23.5 42.4
## 6 8200 2250 39 5727 34 22.8 49.2
## 7 5023 1426 49 3425 41 17.2 35.2
## 8 10583 2858 28 7375 27 23.8 55.4
## 9 15939 4209 18 11114 17 39.9 72.2
## 10 10865 2934 33 7316 32 22.6 68.8
## mean_height dom_sp est_age geometry
## 1 14.1 PIB 45 POINT (-72.586 46.34558)
## 2 13.6 PIB 45 POINT (-72.57578 46.34992)
## 3 13.6 BOP 45 POINT (-72.58229 46.34743)
## 4 16.9 PIB/PIG 45 POINT (-72.583 46.34656)
## 5 18.3 PIB 45 POINT (-72.58397 46.34653)
## 6 15.1 PIB 45 POINT (-72.58511 46.34597)
## 7 13.0 PIB/ERR 45 POINT (-72.58475 46.3465)
## 8 17.1 Mixte 50 POINT (-72.58601 46.34511)
## 9 20.0 Mixte 70 POINT (-72.58511 46.34481)
## 10 16.5 Feuillu 55 POINT (-72.5843 46.34298)
It is very easy to transform the CRS of an object by using the st_transform function. It can be done my manually by entering the datum, ellipsoid and projection, but the simplest way is to just feed the EPSG code to the function. EPSG codes are unique identifiers for CRS and can be found easily online. Let’s transform to WGS84 (EPSG 4326).
## Simple feature collection with 36 features and 16 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -72.58601 ymin: 46.34263 xmax: -72.57278 ymax: 46.35089
## Geodetic CRS: WGS 84
## First 10 features:
## zone parcelle zone_eco sup_zone_ha sup_parc_ha tot_drymass
## 1 18 18A jeune pinede blanche 0.213 0.04 8861
## 2 4 4A jeune pinede blanche 0.552 0.04 11379
## 3 13 13A mixte 1.441 0.04 6035
## 4 17 17A mixte 2.789 0.04 13721
## 5 17 17B mixte 2.782 0.04 9865
## 6 17 17C mixte 2.782 0.04 10449
## 7 17 17D mixte 2.782 0.04 6449
## 8 20 20A mixte suranne 1.761 0.04 13441
## 9 20 20B mixte suranne 1.761 0.04 20148
## 10 20 20C mixte suranne 1.761 0.04 13799
## aerial_drymass root_drymass n_trees total_C n_living_trees mean_dhp max_dhp
## 1 6877 1984 95 4735 84 14.9 29.0
## 2 8896 2483 50 6236 50 23.5 33.9
## 3 4692 1343 51 3146 38 16.6 28.9
## 4 10766 2955 57 7651 42 22.2 79.1
## 5 7726 2140 40 5397 29 23.5 42.4
## 6 8200 2250 39 5727 34 22.8 49.2
## 7 5023 1426 49 3425 41 17.2 35.2
## 8 10583 2858 28 7375 27 23.8 55.4
## 9 15939 4209 18 11114 17 39.9 72.2
## 10 10865 2934 33 7316 32 22.6 68.8
## mean_height dom_sp est_age geometry
## 1 14.1 PIB 45 POINT (-72.586 46.34558)
## 2 13.6 PIB 45 POINT (-72.57578 46.34992)
## 3 13.6 BOP 45 POINT (-72.58229 46.34743)
## 4 16.9 PIB/PIG 45 POINT (-72.583 46.34656)
## 5 18.3 PIB 45 POINT (-72.58397 46.34653)
## 6 15.1 PIB 45 POINT (-72.58511 46.34597)
## 7 13.0 PIB/ERR 45 POINT (-72.58475 46.3465)
## 8 17.1 Mixte 50 POINT (-72.58601 46.34511)
## 9 20.0 Mixte 70 POINT (-72.58511 46.34481)
## 10 16.5 Feuillu 55 POINT (-72.5843 46.34298)
At this point, R recognizes the data as spatial data and will plot it as such if prompted to. The plot can then be manipulated into a simple data representation using base R. (More interesting with polygons… see https://r-spatial.github.io/sf/articles/sf5.html)
## Warning: plotting the first 9 out of 16 attributes; use max.plot = 16 to plot
## all
plot(dfs["tot_drymass"],
key.pos = 1, #(1=below, 2=left, 3=above and 4=right)
axes = T,
key.width = lcm(1.3),
key.length = 1.0,
)
plot(dfs["tot_drymass"],
key.pos = 1, #(1=below, 2=left, 3=above and 4=right)
axes = T,
key.width = lcm(1.3),
key.length = 1.0,
breaks = "jenks")
and the same can be done using ggplot2:
We can now see that we have spatial points with some values, but that does not give us much of an idea of what they are and their context. For adding context, a basemap is really important.
Note: Until recently, get_map from the package ggmap allowed to easily obtain basemaps from Google and OpenStreetMaps. Sadly, Google now requires an API key and billing information to access the basemaps, and it also broke the access to open access maps such as Stamen. That makes loading a static basemap a bit harder.
While it is still possible to access Google maps and OpenStreetMaps, it is required to register an API key (which can generate costs for very heavy users) .
https://www.r-bloggers.com/geocoding-with-ggmap-and-the-google-api/ for Google API
# download a basemap for our data using get_stamenmap, and the bounding box in "sp" format. adding a small buffer around the bounding box prevents the basemap from being too limited in it's spatial extent
map = get_stamenmap(bbox=bbox(as_Spatial(dfs))+c(-0.001,-0.001,0.001,0.001),
maptype = "terrain",
zoom = 15)
# other basemap types from stamen:
#“terrain”, “terrain-background”, “terrain-labels”, “terrain-lines”, “toner”, “toner-2010”, “toner-2011”, “toner-background”, “toner-hybrid”, “toner-labels”, “toner-lines”, “toner-lite”, “watercolor”
# using ggmap with the ggplot2 syntax allows to create complete static maps with relative ease
map1 = ggmap(map) +
geom_sf(data = dfs,
aes(fill = tot_drymass),
size = 3,
shape = 21,
inherit.aes=F
) +
labs(
title = "Estimated forest drymass on UQTR campus, 2016",
# subtitle = "Subtitle",
caption = "Basemap = Stamen Terrain, CRS = EPSG:4326 WGS84, produced 2021-02-02",
tag = "a)",
fill = "Tot. Drymass (kg)",
x = "Longitude",
y = "Latitude"
)+
theme(
axis.text.x = element_text(angle = 45, hjust=1),
axis.text.y = element_text(angle = 45, hjust=1),
plot.caption = element_text(hjust = 0.5),
plot.title.position = "plot"
)+
north(dfs,scale=0.2)+
scalebar(dfs, dist = 250, dist_unit = "m",
transform = TRUE, model = "WGS84", st.bottom = TRUE, st.dist = 0.05)
map1
We are planning a more advanced workshop about R and static maps in the future, but the focus of this workshop is about interactive maps, more specifically using the Leaflet package.
Step 3. Using leaflet to create interactive maps
Luckily, leaflet is still able to call basemaps without registering an API key.
By default, leaflet uses OpenStreetMap.
But it can be set to call other data sources, including Google if an API key is registered.
It possesses different types of markers, such as pins.
Or circles.
And those can be customized to display data values with continuous scales.
pal = colorNumeric(
palette = "RdYlGn",
domain = dfs$tot_drymass
)
leaflet(dfs) %>%
addTiles() %>%
addCircleMarkers(color = "black", opacity=0.5, fillColor=~pal(tot_drymass), fillOpacity = 0.5, radius=8) %>%
addLegend("bottomright",
pal=pal,
values=~tot_drymass,
title= "drymass (kg / 0.04 ha)")
Or by quantiles.
qpal = colorQuantile("RdYlGn", dfs$tot_drymass, n = 4)
leaflet(dfs) %>%
addTiles() %>%
addCircleMarkers(color = "black", opacity=0.5, fillColor=~qpal(tot_drymass), fillOpacity = 0.5, radius=8) %>%
addLegend("bottomright",
pal=qpal,
values=~tot_drymass,
title= "Drymass Quantile") %>%
addScaleBar(position="bottomleft")
Marker size can also be fixed to represent the size of a variable.
leaflet(dfs) %>%
addTiles() %>%
addCircles(color = "black", opacity=0.5, fillColor=~pal(tot_drymass), fillOpacity = 0.8, radius=dfs$n_trees) %>%
addLegend("bottomright",
pal=qpal,
values=~tot_drymass,
title= "Drymass Quantile") %>%
addScaleBar(position="bottomleft")
And then this map can be saved to a png or jpg format using the mapview package.
Step 4. Loading and plotting vectors and rasters
Often, we have polygons delimiting zones of interest for our data. They can be used for visualization, but also in data analysis.
##adding polygon
campus = st_read("../Data/shapefiles/campus polygon.shp") %>%
st_transform("epsg:4326")
## Reading layer `campus polygon' from data source `C:\Users\Arthur\Documents\rscripts\NumeriLab_RGIS\Data\shapefiles\campus polygon.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 1 field
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -8080357 ymin: 5804215 xmax: -8078634 ymax: 5805890
## Projected CRS: World_Mercator
leaflet(dfs) %>%
addTiles() %>%
addCircles(stroke=T, color="black", opacity=0.5, fillColor=~pal(dfs$tot_drymass), fillOpacity = 0.8, radius=dfs$n_trees) %>%
addLegend("bottomright",
pal=qpal,
values=~tot_drymass,
title= "values") %>%
addScaleBar(position="bottomleft") %>%
addPolygons(data=campus, stroke=T, fillOpacity = 0)
When working with spatial data, it can often be useful to access remote sensing data to perform different types of analysis.
The rest of the workshop will be mostly about working with such data in R, based on a sample of Sentinel-2 imagery.
Then, let’s load some raster that we can work with. To do so, the package “raster” contains most of the basic functions required for R to read and manage rasters. A raster is basically a spatially referenced matrix where each value is associated with a spatial extent (resolution) and a coordinate taken from a geodetic system (ex: WGS84).
# The raster function allows to read the most common raster formats, such as .TIFF
tr_r = raster("../Data/S2/True color.tiff")
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj
## = prefer_proj): Discarded ellps WGS 84 in CRS definition: +proj=merc +a=6378137
## +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null
## +wktext +no_defs +type=crs
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum World Geodetic System 1984 in CRS definition