API: eurostat

Access statistics at European level through the Eurostat API.

Using the SKEMA Quantum Studio framework (Warin 2019), this course teaches you how to use the eurostat R package.

Database description

Eurostat is the statistical office of the European Union. While statistical authorities in Member States collect and analyse data, Eurostat’s role is to consolidate the data and ensure they are comparable. It provides statistics at European level that enable comparisons between countries and regions. From EU policies, economy, and finance to social conditions and the environment, Eurostat consolidates data using a harmonized methodology.

Eurostat: https://ec.europa.eu/eurostat/fr/about/overview

Functions

Each of these functions is detailed in this course, with examples.

get_eurostat_toc()

The function get_eurostat_toc() downloads the table of contents of Eurostat datasets. Each row describes either a folder or a dataset, together with its code and update dates.


# Load the package
library(eurostat)
library(rvest)

# Get Eurostat data listing
toc <- get_eurostat_toc()

# Show the first rows of the table of contents
head(toc)
| title | code | type | last update of data | last table structure change | data start | data end | values |
|---|---|---|---|---|---|---|---|
| Database by themes | data | folder | NA | NA | NA | NA | NA |
| General and regional statistics | general | folder | NA | NA | NA | NA | NA |
| European and national indicators for short-term analysis | euroind | folder | NA | NA | NA | NA | NA |
| Business and consumer surveys (source: DG ECFIN) | ei_bcs | folder | NA | NA | NA | NA | NA |
| Consumer surveys (source: DG ECFIN) | ei_bcs_cs | folder | NA | NA | NA | NA | NA |
| Consumers - monthly data | ei_bsco_m | dataset | 08.01.2020 | 08.01.2020 | 1980M01 | 2019M12 | NA |
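The table of contents mixes folders and datasets, so a common first step is to keep only the dataset rows. The following is a minimal sketch of that step in base R. It assumes the column names shown above (title, code, type); `toc_sample` is a hypothetical, hand-copied stand-in for a few of the rows printed above, so that the snippet runs without a network connection.

```r
# Hypothetical stand-in for a few rows of get_eurostat_toc(),
# copied by hand from the output above so this runs offline.
toc_sample <- data.frame(
  title = c("Database by themes",
            "Business and consumer surveys (source: DG ECFIN)",
            "Consumers - monthly data"),
  code  = c("data", "ei_bcs", "ei_bsco_m"),
  type  = c("folder", "folder", "dataset"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose type is "dataset"; folders only group other entries.
datasets_only <- toc_sample[toc_sample$type == "dataset", ]
print(datasets_only$code)
```

With the real table of contents, the same subsetting would presumably be applied to `toc` instead of `toc_sample`.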

search_eurostat()

With search_eurostat() you can search the table of contents for particular patterns, e.g. all datasets related to passenger transport. The type argument of this function restricts the search to a given kind of entry, for instance datasets or tables.


# info about passengers
search_eurostat("passenger transport")
| title | code | type | last update of data | last table structure change | data start | data end | values |
|---|---|---|---|---|---|---|---|
| Volume of passenger transport relative to GDP | tran_hv_pstra | dataset | 12.09.2019 | 12.09.2019 | 2000 | 2017 | NA |
| Modal split of passenger transport | tran_hv_psmod | dataset | 09.09.2019 | 09.09.2019 | 1990 | 2017 | NA |
| Air passenger transport by reporting country | avia_paoc | dataset | 23.12.2019 | 07.11.2019 | 1993 | 2019Q3 | NA |
| Air passenger transport by main airports in each reporting country | avia_paoa | dataset | 23.12.2019 | 04.12.2019 | 1993 | 2019Q3 | NA |
| Air passenger transport between reporting countries | avia_paocc | dataset | 23.12.2019 | 07.11.2019 | 1993 | 2019Q3 | NA |
| Air passenger transport between main airports in each reporting country and partner reporting countries | avia_paoac | dataset | 23.12.2019 | 07.11.2019 | 1993 | 2019Q3 | NA |
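To make the idea of a pattern search over the table of contents concrete, here is a rough base R sketch of matching a pattern against the title column. It is an illustration only, not the package's implementation, and search_eurostat() may behave differently in its details. `toc_sample` is a hypothetical stand-in built from rows shown earlier in this course.

```r
# Hypothetical stand-in for a few table-of-contents rows (copied by hand).
toc_sample <- data.frame(
  title = c("Volume of passenger transport relative to GDP",
            "Modal split of passenger transport",
            "Consumers - monthly data"),
  code  = c("tran_hv_pstra", "tran_hv_psmod", "ei_bsco_m"),
  type  = rep("dataset", 3),
  stringsAsFactors = FALSE
)

# Match the pattern literally (fixed = TRUE) against each title
# and keep the rows where it occurs.
hits <- toc_sample[grepl("passenger transport", toc_sample$title, fixed = TRUE), ]
print(hits$code)
```

Because the match is literal and case-sensitive here, "Passenger Transport" would not match; whether that is desirable depends on how the titles are written.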

Once you have found the dataset you are looking for, you can store its code in a variable of your choice.


id <- search_eurostat("Modal split of passenger transport", 
                         type = "table")$code[1]
print(id)

[1] "t2020_rk310"

get_eurostat()

The function get_eurostat() takes the code of a dataset as input and returns the data from that dataset. The str() function lets you inspect the structure of the downloaded data set.


dat <- get_eurostat(id)
str(dat)

Classes 'tbl_df', 'tbl' and 'data.frame':   2587 obs. of  5 variables:
 $ unit   : Factor w/ 1 level "PC": 1 1 1 1 1 1 1 1 1 1 ...
 $ vehicle: Factor w/ 3 levels "BUS_TOT","CAR",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ geo    : Factor w/ 34 levels "AT","BE","CH",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ time   : Date, format: "1990-01-01" ...
 $ values : num  11 10.6 3.7 9.1 11.3 32.4 14.9 13.5 6 24.8 ...
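| unit | vehicle | geo | time | values |
|---|---|---|---|---|
| PC | BUS_TOT | AT | 1990-01-01 | 11.0 |
| PC | BUS_TOT | BE | 1990-01-01 | 10.6 |
| PC | BUS_TOT | CH | 1990-01-01 | 3.7 |
| PC | BUS_TOT | DE | 1990-01-01 | 9.1 |
| PC | BUS_TOT | DK | 1990-01-01 | 11.3 |
| PC | BUS_TOT | EL | 1990-01-01 | 32.4 |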
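Once a dataset is in memory, it can also be narrowed down locally with ordinary data frame tools. The sketch below assumes the column layout shown above (unit, vehicle, geo, time, values); `dat_sample` is a hypothetical stand-in that reproduces the six rows printed above so the snippet runs offline.

```r
# Hypothetical stand-in for the first rows of the downloaded data (copied by hand).
dat_sample <- data.frame(
  unit    = "PC",
  vehicle = "BUS_TOT",
  geo     = c("AT", "BE", "CH", "DE", "DK", "EL"),
  time    = as.Date("1990-01-01"),
  values  = c(11.0, 10.6, 3.7, 9.1, 11.3, 32.4),
  stringsAsFactors = FALSE
)

# Keep only the rows for Austria and Switzerland.
at_ch <- subset(dat_sample, geo %in% c("AT", "CH"))
print(at_ch)
```

Filtering locally like this still requires downloading the whole dataset first.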

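Filters can be added to retrieve only a specific part of the dataset. In the example below, the filters keep the EU28 aggregate and Finland, and only the most recent time period.

By default, variables are returned as Eurostat codes. To get human-readable labels instead, pass the argument type = "label".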


datl <- get_eurostat(id, filters = list(geo = c("EU28", "FI"), 
                                         lastTimePeriod = 1), 
                      type = "label", time_format = "num")
| unit | vehicle | geo | time | values |
|---|---|---|---|---|
| Percentage | Motor coaches, buses and trolley buses | European Union - 28 countries | 2017 | 8.8 |
| Percentage | Motor coaches, buses and trolley buses | Finland | 2017 | 10.4 |
| Percentage | Passenger cars | European Union - 28 countries | 2017 | 83.3 |
| Percentage | Passenger cars | Finland | 2017 | 84.2 |
| Percentage | Trains | European Union - 28 countries | 2017 | 7.9 |
| Percentage | Trains | Finland | 2017 | 5.4 |

The result shows the share of each mode of passenger transport in Finland, alongside the same shares for the European Union (28 countries), in 2017.
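The comparison can be made explicit by computing, for each mode of transport, the difference between Finland and the EU28 aggregate. This is a minimal base R sketch; `datl_sample` is a hypothetical stand-in that reproduces the six values printed above, so the snippet runs offline.

```r
# Hypothetical stand-in for the labelled, filtered data (values copied from above).
datl_sample <- data.frame(
  vehicle = rep(c("Motor coaches, buses and trolley buses",
                  "Passenger cars",
                  "Trains"), each = 2),
  geo     = rep(c("European Union - 28 countries", "Finland"), times = 3),
  values  = c(8.8, 10.4, 83.3, 84.2, 7.9, 5.4),
  stringsAsFactors = FALSE
)

# Reshape to one row per vehicle and one column per geo.
# Each (vehicle, geo) pair occurs once, so sum() just returns that value.
wide <- tapply(datl_sample$values,
               list(datl_sample$vehicle, datl_sample$geo),
               sum)

# Percentage-point gap: Finland minus the EU28 aggregate.
gap <- round(wide[, "Finland"] - wide[, "European Union - 28 countries"], 1)
print(gap)
```

With these values, Finland is 1.6 percentage points above the EU28 figure for buses and coaches, 0.9 above for passenger cars, and 2.5 below for trains.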

tl;dr


# Load the packages
library(eurostat)
library(knitr)  # provides kable(), used below

# Get Eurostat data listing
toc <- get_eurostat_toc()

# Info about passengers
kable(head(search_eurostat("passenger transport")))

# Code of the dataset
id <- search_eurostat("Modal split of passenger transport",
                      type = "table")$code[1]

# Raw data
dat <- get_eurostat(id)
str(dat)

# Add filters and human-readable labels
datl <- get_eurostat(id, filters = list(geo = c("EU28", "FI"),
                                        lastTimePeriod = 1),
                     type = "label", time_format = "num")

Code learned this week

| Command | Detail |
|---|---|
| get_eurostat_toc() | Downloads the table of contents of Eurostat datasets |
| search_eurostat() | Searches the table of contents for particular patterns |
| get_eurostat() | Reads Eurostat data for a specific dataset code |

References

This course uses the Eurostat Tutorial


Warin, Thierry. 2019. “SKEMA Quantum Studio: A Technological Framework for Data Science in Higher Education.” https://doi.org/10.6084/m9.figshare.8204195.v2.

Citation

For attribution, please cite this work as

Warin (2020, Jan. 28). Virtual Campus: API: eurostat. Retrieved from https://virtualcampus.skemagloballab.io/posts/apieurostat/

BibTeX citation

@misc{warin2020api:,
  author = {Warin, Thierry},
  title = {Virtual Campus: API: eurostat},
  url = {https://virtualcampus.skemagloballab.io/posts/apieurostat/},
  year = {2020}
}