A short post this time around, with loads of graphs and a very current and relevant topic, but with absolutely nothing to do with Structural Engineering.
I have been thinking of doing something with open data for a while now, and all the discussions about covid triggered me again.
Online data
It didn't even require a lot of research, as it happens. Searching for "covid open data" got me a link to a (Dutch) government website with an overview of publicly available covid data sources. From there it was just a few clicks to the European Centre for Disease Prevention and Control (ECDC), which keeps a day-to-day overview of covid-related data.
Sidenote: in this search I stumbled on the website Our World in Data. They have interesting data stories on all kinds of topics, of which https://ourworldindata.org/coronavirus and https://ourworldindata.org/coronavirus-testing are just two - covid related - examples with some very cool infographics! Don't expect me to produce similar kinds of graphs in this blog (yet)! ;-)
Retrieving the data
For the Python scripting I'm using JupyterLab again, just like in my previous (Structural Engineering related) blog. Retrieving the data itself requires very little effort: import the data packages and read the online CSV data file from its URL straight into a pandas DataFrame.
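As a minimal sketch of this step (not necessarily the exact script used for this post): the URL below is my assumption of where the ECDC CSV lives, so check it against the ECDC site before relying on it. The helper name load_covid_data and the tiny sample rows are made up purely for illustration, so that the example also runs offline.

```python
import io

import pandas as pd

# Assumed location of the ECDC daily case distribution CSV; verify before use.
ECDC_CSV_URL = "https://opendata.ecdc.europa.eu/covid19/casedistribution/csv"


def load_covid_data(source=ECDC_CSV_URL):
    """Read a CSV from a URL, a local path or a file-like object into a DataFrame."""
    return pd.read_csv(source)


# Offline illustration with made-up rows (not real ECDC figures).
sample_csv = io.StringIO(
    "dateRep,cases,deaths,countriesAndTerritories\n"
    "01/04/2020,10,1,Netherlands\n"
    "02/04/2020,12,0,Netherlands\n"
)
df = load_covid_data(sample_csv)
print(df.head())
```

Calling load_covid_data() without arguments would fetch the online file instead, since pd.read_csv accepts a URL directly.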
Converting the data
The data types of the different columns are not usable yet, and some columns are not required (in my case). The dateRep column is converted to an actual date/time data type, some columns are dropped (using a list, since that keeps the script flexible) and others are renamed for ease of use (who came up with the original column names?).
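The conversion could look roughly like the sketch below. Only dateRep is named in the text above; the other column names, the day/month/year date format, the new names and the helper convert are assumptions for the sake of the example, and the rows are again made up.

```python
import pandas as pd


def convert(df, drop_cols, rename_map):
    """Parse dateRep as a date, drop unneeded columns and rename the rest."""
    out = df.copy()
    # Assumes dates are written as day/month/year, e.g. 01/04/2020.
    out["dateRep"] = pd.to_datetime(out["dateRep"], format="%d/%m/%Y")
    # errors="ignore" skips names that are not present, which keeps the list flexible.
    out = out.drop(columns=drop_cols, errors="ignore")
    return out.rename(columns=rename_map)


# Made-up rows, only to show the shape of the data.
raw = pd.DataFrame(
    {
        "dateRep": ["01/04/2020", "02/04/2020"],
        "day": [1, 2],
        "month": [4, 4],
        "year": [2020, 2020],
        "cases": [10, 12],
        "deaths": [1, 0],
        "countriesAndTerritories": ["Netherlands", "Netherlands"],
    }
)

drop_cols = ["day", "month", "year"]
rename_map = {"dateRep": "date", "countriesAndTerritories": "country"}

clean = convert(raw, drop_cols, rename_map)
print(clean.dtypes)
```

Keeping the columns to drop in a separate list means they can be changed in one place without touching the rest of the script.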
Plotting the data
Finally we get to plot the data. I wrote the plotting code in such a way that I can easily show my own selection of the data. This is done by means of the "data_plot", "c_plot" and "emphasize" parameters.
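A possible shape for such a plot function is sketched below. The three parameter names come from the text, but their meaning here is my guess: data_plot is taken to be the column to plot, c_plot the list of countries to show and emphasize the countries to draw with a heavier line. The function name plot_countries, the "date" and "country" column names and the sample rows are hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; leave this line out inside JupyterLab
import matplotlib.pyplot as plt
import pandas as pd


def plot_countries(df, data_plot, c_plot, emphasize=()):
    """Plot column `data_plot` over time for each country in `c_plot`."""
    fig, ax = plt.subplots(figsize=(10, 5))
    for country in c_plot:
        subset = df[df["country"] == country].sort_values("date")
        highlighted = country in emphasize
        ax.plot(
            subset["date"],
            subset[data_plot],
            label=country,
            linewidth=2.5 if highlighted else 1.0,
            alpha=1.0 if highlighted else 0.6,
        )
    ax.set_xlabel("date")
    ax.set_ylabel(data_plot)
    ax.legend()
    fig.autofmt_xdate()  # tilt the date labels so they do not overlap
    return ax


# Made-up rows, not real figures.
data = pd.DataFrame(
    {
        "date": pd.to_datetime(["2020-04-01", "2020-04-02"] * 2),
        "country": ["Netherlands", "Netherlands", "Belgium", "Belgium"],
        "cases": [10, 12, 8, 9],
    }
)

ax = plot_countries(
    data, data_plot="cases", c_plot=["Netherlands", "Belgium"], emphasize=["Netherlands"]
)
```

Switching to another column or another set of countries then only requires changing the arguments, not the plotting code itself.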
- Dutch government site with links to data sources on covid: https://www.digitaleoverheid.nl/overzicht-van-alle-onderwerpen/nieuwe-technologieen-data-en-ethiek/het-led/led-nieuws/dossierpostcontext/covid-19-in-data/
- Dutch Institute for Public Health and Environment (RIVM) overview of data sources related to covid: https://www.databronnencovid19.nl/
- Google mobility data during the COVID pandemic: https://www.google.com/covid19/mobility/
- Dutch Bureau for Statistics (CBS) open data (general): https://opendata.cbs.nl/