Spatial Analysis I

Mari Sakamoto
4 min read · Jan 15, 2022

Let's explore in baby steps with R.

Photo by Mael BALLAND on Unsplash

What is Spatial Data?

Put simply, spatial data records not only what was measured but also where it was measured, so you can study how an event varies from place to place. Non-spatial data ignores that variation over space.

First Law of Geography

“Everything is related to everything else, but near things are more related than distant things.” (Tobler, 1970)

Spatial analysis covers a wide range of themes, for example crime rate by district, electoral votes by state, poverty by neighbourhood and weather forecasting. The map below, for instance, gives insights into vaccine coverage in each country.

How to organize your data?

Usually, you will need to organize your data in a tidy, wide layout: one column per variable and one row per observation.

Also, we will need a geographic coordinate reference system (CRS) to tie the data to a specific place. The most familiar one uses latitude (angular distance from the Equator, from 90° South to 90° North) and longitude (angular distance from the Greenwich Meridian, from 180° West to 180° East).

Note: To convert 23° 40' 30.5'' W to decimal degrees, divide the minutes by 60 and the seconds by 3600 and add both to the degrees, which gives -23.6751. Coordinates to the South or West are negative. You can also use the measurements package.
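The arithmetic in the note can be sketched in base R. The helper name dms_to_decimal is made up for this illustration; the measurements package offers conv_unit() for the same kind of conversion if you prefer not to write it by hand.

```r
# Convert degrees, minutes and seconds to decimal degrees.
# dms_to_decimal is a hypothetical helper, not part of any package.
dms_to_decimal <- function(degrees, minutes, seconds, hemisphere) {
  decimal <- degrees + minutes / 60 + seconds / 3600
  # South and West are negative by convention
  ifelse(hemisphere %in% c("S", "W"), -decimal, decimal)
}

# 23 degrees, 40 minutes, 30.5 seconds West
round(dms_to_decimal(23, 40, 30.5, "W"), 4)
```

Keeping the hemisphere as a separate argument makes the sign rule explicit instead of hiding it in the input.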

Shapefile Objects

Shapefiles are files commonly used in GIS software that hold geographic information, including location and shape (Lansley and Cheshire, 2018).

File extensions:

  • .shp: contains the geometry, which means the polygons that make up the map
  • .dbf: contains the attribute table (the data for each polygon)
  • .shx: index file that links the records in the .shp and .dbf files
  • .prj: describes the geographic coordinate system that the map uses

Simple Feature Objects

Simple feature objects are data frames containing an array of GIS coordinates, according to Pebesma (2018).

That means that any data set containing geographic coordinates in its observations can become a simple feature object. To do so, use the st_as_sf() function from the sf package.

This is a great advantage because you don’t need a complete shapefile with all the .shp, .dbf, .prj, … files to do spatial analysis. Additionally, you can use tmap to draw layers from the simple feature data on top of free online maps (e.g. OpenStreetMap and Esri).

Hands-On with R

1. First, install the required packages
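The original package list is not reproduced here, so the following is only a sketch that assumes the packages named in this article. Note that ggplotly() lives in the plotly package, and that readOGR() comes from rgdal, which was retired from CRAN in 2023; the list therefore includes sf, whose st_read() covers the same job.

```r
# Assumed package list, based on the tools mentioned in this article
pkgs <- c("sf", "tmap", "ggplot2", "plotly", "measurements")

# Install only the packages that are not installed yet
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing_pkgs) > 0) install.packages(missing_pkgs)

# Load everything
invisible(lapply(pkgs, library, character.only = TRUE))
```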

2. Reading a shapefile

To read the shapefile, use the readOGR() function from the rgdal package: its dsn argument takes the folder name and its layer argument takes the file name without the extension (e.g. estado_sp for estado_sp.shp).

In this case, we are reading a map of São Paulo State, Brazil, which contains the name and code number of the 643 cities in the state along with their geographic borders.

We also added a new variable containing the approximate area of each city.
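As a rough sketch of this step, the code below reads a shapefile and adds an area column. It swaps in sf::st_read() for readOGR(), since rgdal has been retired from CRAN, and it uses the North Carolina shapefile bundled with sf as a stand-in for the São Paulo map so that it runs without extra files. The folder name shown in the comment is hypothetical.

```r
library(sf)

# Stand-in data: the North Carolina shapefile that ships with sf.
# For your own map, point dsn at the folder and layer at the file name
# without its extension, for example:
#   st_read(dsn = "shapefile_sp", layer = "estado_sp")  # hypothetical folder
nc_path <- system.file("shape/nc.shp", package = "sf")
counties <- st_read(nc_path, quiet = TRUE)

# One row per polygon, plus a geometry column
nrow(counties)
names(counties)

# Approximate area of each polygon in square kilometres.
# st_area() returns square metres for this layer, hence the division.
counties$area_km2 <- as.numeric(st_area(counties)) / 1e6

head(st_drop_geometry(counties[, c("NAME", "area_km2")]))
```

st_read() returns a simple feature object rather than the Spatial*DataFrame that readOGR() produces, so the result can be handled like an ordinary data frame.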

3. Adding external information to your spatial data

To bring in more information about the cities, we merged in another data set with the Human Development Index (HDI) of each city.
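A merge of this kind could look like the sketch below. It assumes an sf object and a plain data frame that share a key column; the North Carolina counties bundled with sf stand in for the cities, and the indicator values are placeholders generated for the example, not real HDI figures.

```r
library(sf)

counties <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Hypothetical external table: one row per county code with a
# placeholder indicator (NOT real HDI values)
indicator <- data.frame(
  FIPS = counties$FIPS,
  hdi_placeholder = seq(0.60, 0.85, length.out = nrow(counties))
)

# merge() keeps the sf class when the spatial object comes first;
# all.x = TRUE keeps polygons that have no match in the table
counties_hdi <- merge(counties, indicator, by = "FIPS", all.x = TRUE)

head(st_drop_geometry(counties_hdi[, c("NAME", "FIPS", "hdi_placeholder")]))
```

Before merging your own data, check that the key has the same type on both sides (for example, character codes on one side and numeric codes on the other will not match).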

4. Plotting your spatial data

4.1. Using ggplot2 and ggplotly

ggplot2 is a popular package for drawing graphs in R, and it can also be used with spatial data. Combined with the ggplotly() function from the plotly package, it creates interactive plots. Here it was used to show the name and HDI of each city when the user hovers the mouse over it.
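One way to build such a plot is sketched below, assuming an sf object and geom_sf(). The North Carolina counties bundled with sf stand in for the cities, and the BIR74 column stands in for the HDI.

```r
library(sf)
library(ggplot2)
library(plotly)

counties <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Static choropleth: fill each polygon by a numeric column.
# The `text` aesthetic is not a ggplot2 aesthetic; ggplotly() uses it for
# the tooltip, and ggplot2 warns that it is unknown, which is expected.
static_map <- ggplot(counties) +
  geom_sf(aes(fill = BIR74, text = paste0(NAME, ": ", BIR74)), color = "white") +
  scale_fill_viridis_c(name = "BIR74") +
  theme_minimal()

# Interactive version: hovering over a polygon shows the tooltip text
interactive_map <- ggplotly(static_map, tooltip = "text")

# Type interactive_map (or print it) to open the plot
```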

HDI of each city in São Paulo State

4.2. Using the tmap package

The tmap package makes plotting spatial data even easier. In this code, we are plotting the same information as the previous example, but we are changing the colours by quartile and adding a histogram and title to the map.
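A sketch of such a map follows, using the argument names of tmap version 3, which was current when this article was written; newer tmap releases (version 4) rename several of them. The North Carolina counties bundled with sf stand in for the cities and BIR74 stands in for the HDI.

```r
library(sf)
library(tmap)

counties <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

tmap_mode("plot")  # static map

# Four quantile classes (quartiles), a histogram in the legend and a title
quartile_map <- tm_shape(counties) +
  tm_polygons(
    col = "BIR74",
    style = "quantile",
    n = 4,
    legend.hist = TRUE
  ) +
  tm_layout(title = "BIR74 by quartile", legend.outside = TRUE)

# Type quartile_map (or print it) to draw the map
```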

5. Extracting some areas of the shapefile

In this example, we want to extract South America from a world map shapefile.
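With an sf object this is ordinary row subsetting, as the sketch below shows. It assumes the World data set that ships with tmap as a stand-in for the world shapefile, and relies on its continent column.

```r
library(sf)

# Stand-in world map: the World data set bundled with tmap
data("World", package = "tmap")

# Keep only the rows (countries) whose continent is South America.
# Subsetting an sf object by rows keeps the geometry column.
south_america <- World[which(World$continent == "South America"), ]

nrow(south_america)
```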

6. Binding shapefiles

In this example, we join different shapefiles into a single object by simply using the bind function.
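The original call is not shown here. For sf objects, base rbind() does an equivalent job, so the sketch below uses it instead. It splits the North Carolina counties bundled with sf into two pieces that stand in for two separate shapefiles with the same columns and CRS.

```r
library(sf)

counties <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Pretend these are two separate shapefiles with the same columns and CRS
part_a <- counties[1:50, ]
part_b <- counties[51:100, ]

# rbind() stacks the features (rows) of sf objects into one layer
combined <- rbind(part_a, part_b)

nrow(combined)
```

If the layers use different coordinate reference systems, transform one of them with st_transform() before binding.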

7. Working with simple feature objects

First, we loaded a data frame containing the names, addresses and coordinates of some shopping centres, and then converted it to a simple feature object.
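The conversion can be sketched as follows. The table is made up for illustration (names, regions and coordinates are not real shopping centres), and the CRS is assumed to be plain WGS 84 longitude and latitude.

```r
library(sf)

# Made-up example table: all values are illustrative only
shoppings <- data.frame(
  name   = c("Shopping A", "Shopping B", "Shopping C"),
  region = c("Centre", "South", "East"),
  lon    = c(-46.63, -46.69, -46.55),
  lat    = c(-23.55, -23.62, -23.54)
)

# coords takes the x column (longitude) first, then y (latitude).
# crs = 4326 declares WGS 84 longitude/latitude.
shoppings_sf <- st_as_sf(shoppings, coords = c("lon", "lat"), crs = 4326)

shoppings_sf
```

The coordinate columns are replaced by a single geometry column, while the other columns stay as they were.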

Now that we have a simple feature object containing the shopping centres’ coordinates and regions, we want to plot their locations and colour them by region.

Also, to turn on the interactive basemap layers that add context to the data, simply call tmap_mode("view"); to turn them off, call tmap_mode("plot").
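Put together, plotting the points by region in interactive mode might look like the sketch below. It uses tmap version 3 argument names and a made-up table of points; none of the names or coordinates are real shopping centres.

```r
library(sf)
library(tmap)

# Made-up points, converted to a simple feature object
shoppings_sf <- st_as_sf(
  data.frame(
    name   = c("Shopping A", "Shopping B", "Shopping C"),
    region = c("Centre", "South", "East"),
    lon    = c(-46.63, -46.69, -46.55),
    lat    = c(-23.55, -23.62, -23.54)
  ),
  coords = c("lon", "lat"),
  crs = 4326
)

# Interactive mode: the map is drawn on top of online basemap tiles
tmap_mode("view")

# One dot per shopping centre, coloured by region;
# id sets the label shown when hovering over a dot
shopping_map <- tm_shape(shoppings_sf) +
  tm_dots(col = "region", size = 0.1, id = "name")

# Type shopping_map (or print it) to open the interactive map.
# Call tmap_mode("plot") afterwards to go back to static maps.
```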

Finally, we added a shapefile of São Paulo city to the plot.

The final result is a map with layers from the Esri online map, a layer of São Paulo borders from the shapefile and shopping centres data from the simple feature object.

Credits:

This content is based on my class notes from USP ESALQ MBA Data Science & Analytics class.

You can check my repository here: https://github.com/marisakamoto/DataScienceMBA/tree/main/03_SpatialAnalysis

You may want to read Spatial Analysis II, on spatial points and raster objects. It even includes a 3D plot!

Mari Sakamoto

Hi! I am an MBA Data Science candidate from Brazil. Here are my class notes and learning discoveries.