Tuesday, 1 September 2020

Working with Open Data

A short post this time around, with loads of graphs and a very current and relevant topic, but with absolutely nothing to do with Structural Engineering. 

I have been thinking of doing something with open data for a while now, and was triggered again by all the discussions about COVID.


Online data

It didn't even require a lot of research, as it happens. Searching for "covid open data" got me a link to a (Dutch) government website with an overview of links to publicly available COVID data sources. From there it was just a few clicks to the European Centre for Disease Prevention and Control (ECDC), which keeps a day-to-day overview of COVID-related data.

Sidenote: in this search I stumbled on the website Our World in Data. They have interesting data stories on all kinds of topics, of which https://ourworldindata.org/coronavirus and https://ourworldindata.org/coronavirus-testing are just two - covid related - examples with some very cool infographics! Don't expect me to produce similar kinds of graphs in this blog (yet)! ;-)

Retrieving the data

For the Python scripting I'm using Jupyterlab again, just like in my previous (Structural Engineering related) blog. Retrieving the data itself requires very little effort: just import some data packages and read a URL, from an online CSV data file straight into a Pandas dataframe:
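That retrieval step can be sketched like this; the ECDC download URL shown is the one that was current at the time of writing, so treat it as an assumption that may have moved since:

```python
import pandas as pd

# ECDC daily COVID-19 dataset as CSV; URL current at the time of
# writing -- an assumption, it may have moved since
ECDC_URL = "https://opendata.ecdc.europa.eu/covid19/casedistribution/csv"

def fetch_covid_data(url: str = ECDC_URL) -> pd.DataFrame:
    """Read the online CSV straight into a Pandas dataframe."""
    return pd.read_csv(url)

# df = fetch_covid_data()  # needs an internet connection
# df.head()
```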


As you can see, it's easy as pie to get to this data. Of course, it will need some work to be presentable :-)

Converting the data

The data types of the different columns are not usable yet, and some columns are not required (in my case). The dateRep column is converted to an actual date/time data type, some columns are dropped (using a list, since that'll keep the script kinda flexible) and others are renamed for ease of use (who came up with the original column names?):
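A sketch of that conversion, on a tiny synthetic frame using (a subset of) the ECDC column names; the real dataframe comes straight from the download:

```python
import pandas as pd

# Tiny synthetic stand-in with (a subset of) the ECDC column names
df = pd.DataFrame({
    "dateRep": ["31/08/2020", "30/08/2020"],
    "cases": [383, 502],
    "deaths": [2, 1],
    "countriesAndTerritories": ["Netherlands", "Netherlands"],
    "geoId": ["NL", "NL"],
    "countryterritoryCode": ["NLD", "NLD"],
    "popData2019": [17282163, 17282163],
})

# dateRep comes in as text (dd/mm/yyyy); make it a real datetime
df["dateRep"] = pd.to_datetime(df["dateRep"], format="%d/%m/%Y")

# Drop unneeded columns via a list, so the script stays flexible
drop_cols = ["geoId", "countryterritoryCode"]
df = df.drop(columns=drop_cols)

# Rename the unwieldy originals for ease of use
df = df.rename(columns={"dateRep": "date",
                        "countriesAndTerritories": "country",
                        "popData2019": "population"})
```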



As you can see in the bottom part of the screenshot above, this results in a nice little table with data to work with. Lastly, to be able to display results per country, I've grouped it by country:
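The grouping itself is a one-liner; a minimal sketch on a synthetic frame:

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["Netherlands", "Netherlands", "Belgium"],
    "cases": [383, 502, 620],
})

# Group by country so each country's time series can be pulled out
grouped = df.groupby("country")
nl = grouped.get_group("Netherlands")
```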



Plotting the data

Finally we get to plot the data. I wrote the plotting code in such a way that I can easily show my own selection of data. This is done by means of the "data_plot", "c_plot" and "emphasize" parameters in the code below:
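The original plotting code isn't shown here, so the following is my reconstruction of the idea; only the three parameter names are from the post, the function name and internals are assumptions:

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; not needed in a notebook
import matplotlib.pyplot as plt

def plot_covid(df, data_plot, c_plot, emphasize=()):
    """Plot one data column for a selection of countries.

    data_plot -- name of the column to plot
    c_plot    -- list of countries to include
    emphasize -- countries drawn with a bold line
    """
    fig, ax = plt.subplots(figsize=(10, 6))
    for country in c_plot:
        sub = df[df["country"] == country]
        lw = 3 if country in emphasize else 1
        ax.plot(sub["date"], sub[data_plot], label=country, linewidth=lw)
    ax.set_ylabel(data_plot)
    ax.legend()
    return ax
```

A plot for four countries with two of them emphasised then becomes a single call, e.g. `plot_covid(df, "cases_14d_per_100k", ["Netherlands", "Belgium", "France", "Spain"], emphasize=["Netherlands", "Belgium"])` (the column name is hypothetical).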


Now I can, for instance, plot the 14-day cumulative number of COVID cases for the Netherlands, Belgium, France and Spain, with an emphasis (bold line) on the Netherlands and Belgium:


Or we could calculate our own 7-day and 14-day rolling averages of daily confirmed cases for some of the countries that were hit hardest by the current pandemic (ignore the last few data points; I still need to study Pandas' rolling functions a bit more):
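The rolling averages can be computed per country with Pandas' rolling window functions; a sketch on synthetic data (all numbers and column names made up for the example):

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["Netherlands"] * 10,
    "date": pd.date_range("2020-08-01", periods=10),
    "cases": [10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
})

# Per-country 7- and 14-day rolling means of the daily case counts;
# min_periods=1 avoids NaN at the start of each series
df = df.sort_values(["country", "date"])
for window in (7, 14):
    df[f"avg{window}"] = (
        df.groupby("country")["cases"]
          .transform(lambda s, w=window: s.rolling(w, min_periods=1).mean())
    )
```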



As you can see, the combination of Python (in Jupyterlab) and open data is amazing! Almost real-time (open) data at your disposal with just a few lines of code.

Not sure how to use it in my daily structural engineering work, but we'll find a good use for it! :-) If you have good suggestions, feel free to post them below.

FYI, some other interesting open data links I came across:

Link to my Github gist for this script:

Monday, 1 June 2020

Python for number crunching

We've all been there: building an extensive 3D FEM model for a project, tens of thousands of beams/columns, everything seems to work nicely... and then you need to start crunching the numbers to see, for instance, whether all the concrete piers and lintels can be reinforced. Or, even more labour-intensive, the reinforcement for all those 1D elements needs to actually be designed and detailed. Where to start, right?

The result

This is what we would like to get: all sections of all 1D elements of a specific section type, checked against a given reinforcement layout in the section:


Now, how to get there?

Python & Pandas

I've got one such project at hand, and I was thinking about how to make this as easy as possible for myself. While playing with Python I had come across the Pandas framework, but I hadn't really used it yet, other than to easily lay out tables. The proof of the pudding is in the eating, so I decided to use this weekend to try it out some more.

I know SCIA is working on an API for their FEM package SCIA Engineer, but for this case I went the old-fashioned way: I exported the 1D element information to Excel. The result: approximately 3,400 1D elements, with a total of about 25,500 section force combinations:


As a sidenote: to make things easy for yourself, choose to have separate tables exported to separate worksheets ;-) Otherwise you'll have all data in one sheet, which will make life much more difficult.

For the implementation of Python and Pandas, I'm still using Jupyterlab. Pandas is included as standard in Jupyterlab's implementation in Anaconda, so I just have to put the Excel file in a separate folder within the Jupyterlab notebook folder, and we can get started on the number crunching!

Reading and preparing

Importing the Excel to Python with Pandas is actually quite easy:
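Opening the workbook boils down to a single pd.ExcelFile call; the path below is hypothetical:

```python
import pandas as pd

# Hypothetical location of the workbook exported from SCIA Engineer
EXPORT_PATH = "excel/scia_export.xlsx"

def read_export(path: str = EXPORT_PATH) -> pd.ExcelFile:
    """Open the export; .sheet_names then lists the exported tables."""
    return pd.ExcelFile(path)

# xls = read_export()
# xls.sheet_names  # ['Staven', 'Interne 1D-krachten']
```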


As you can see in the screenshot above, it already distinguished between the two worksheets in the Excel document.

Next we parse both sheets to their respective dataframes:


The "Staven" (1D elements) sheet doesn't require a lot of additional processing, since the exported data is already very lean; the .dropna() command is only there just in case. The "Interne 1D-krachten" (internal 1D forces) sheet has some pollution in the table: the first few lines are skipped, and rows with non-matching values (which show up as NaN values) are removed using the .dropna() command. The resulting tables already look quite clean and usable! (The .head(5) command is used to show just the first 5 rows of the dataframes.)
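The cleaning of the forces sheet can be sketched as a small helper; the number of junk rows at the top is an assumption, as are the column names in the example:

```python
import pandas as pd

def clean_forces(raw: pd.DataFrame, skip: int = 2) -> pd.DataFrame:
    """Skip the junk rows at the top, then drop rows whose values
    did not parse (they show up as NaN)."""
    return raw.iloc[skip:].dropna().reset_index(drop=True)

# Example: two junk header rows and one unparsable row disappear
raw = pd.DataFrame({
    "Naam": ["Interne 1D-krachten", None, "B1", None, "B2"],
    "N":    [None, "kN", -120.0, None, 35.0],
})
forces = clean_forces(raw)
```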

Last but not least, the results of both tables are combined and grouped based on the section ("Doorsnede"). For those of you not familiar with SQL-esque database systems: using the "how='outer'" join type when combining these dataframes ensures that every result row of a particular 1D element gets the right name, section, etc. assigned:
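A sketch of that combine-and-group step, with synthetic stand-ins for the two parsed sheets; the 'Naam' join key is an assumption, SCIA's actual column names may differ:

```python
import pandas as pd

# Synthetic stand-ins for the two parsed sheets
staven = pd.DataFrame({"Naam": ["B1", "B2"],
                       "Doorsnede": ["CS1", "CS2"]})
krachten = pd.DataFrame({"Naam": ["B1", "B1", "B2"],
                         "N": [-120.0, -80.0, 35.0]})

# how='outer' keeps every force row and attaches the element's
# name/section information to each of them
combined = pd.merge(krachten, staven, on="Naam", how="outer")
per_section = combined.groupby("Doorsnede")
```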


Interesting bit: the tables are imported with all columns as string types at first. By running the .to_numeric(<...>, errors='ignore') function, the columns that can be converted to numbers will be.
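Applied column by column, that looks like this. The post uses pd.to_numeric(..., errors='ignore'); the loop below has the same effect and also keeps working on newer Pandas versions, where errors='ignore' has been deprecated:

```python
import pandas as pd

# Everything arrives as strings after the Excel import
df = pd.DataFrame({"Naam": ["B1", "B2"], "N": ["-120.0", "35.0"]})

# Convert each column that parses as numeric; leave text columns alone
for col in df.columns:
    try:
        df[col] = pd.to_numeric(df[col])
    except (ValueError, TypeError):
        pass  # non-numeric column, keep as text
```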

Drilling down on results

After this, and some matplotlib magic with for-loops, the section forces for all 1D elements in the model can be visualised, grouped and coloured by section type:
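A sketch of what that plotting loop might look like, on synthetic data; the force column names My/N are assumptions:

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import matplotlib.pyplot as plt

# Synthetic stand-in for the combined results table
combined = pd.DataFrame({
    "Doorsnede": ["CS1", "CS1", "CS2"],
    "My": [15.0, -8.0, 42.0],
    "N": [-120.0, -80.0, 35.0],
})

# One scatter per section type; matplotlib cycles the colours
fig, ax = plt.subplots()
for name, group in combined.groupby("Doorsnede"):
    ax.scatter(group["My"], group["N"], s=10, label=name)
ax.set_xlabel("My [kNm]")
ax.set_ylabel("N [kN]")
ax.legend()
```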



This is still not very usable though; it would be better to (1) have the section forces grouped in graphs per section type, and (2) visualise them in relation to the section capacity.

First, we select the section to analyse:
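Selecting one section type is a simple filter; the variable and section names below are hypothetical:

```python
import pandas as pd

# Synthetic stand-in for the combined results table
combined = pd.DataFrame({
    "Doorsnede": ["CS1", "CS1", "CS2"],
    "N": [-120.0, -80.0, 35.0],
})

elem_nr = "CS1"  # the section type to analyse
sel = combined[combined["Doorsnede"] == elem_nr]
```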



Then we determine the section capacity. Fortunately, I previously made a Python script to calculate the capacity of a given concrete section design; see this previous blog again for some more info. The script can be turned into a reusable function with little effort (MNCapCurve() in the screenshot below), which makes it possible to automate the capacity calculation with some extra input for the concrete cross-section to calculate:
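The real capacity calculation lives in the earlier blog's script; purely to illustrate the function-reuse pattern, here is a heavily simplified stand-in that traces a three-point M-N interaction polygon. The formulas are crude placeholders of my own, not the author's method, and the signature is an assumption:

```python
import numpy as np

def MNCapCurve(b, h, As, fcd=20.0, fyd=435.0):
    """Crude placeholder for the real section-capacity script: a
    three-point M-N polygon (pure compression, rough pure bending,
    pure tension). Dimensions in mm, stresses in N/mm2; output in
    kN and kNm. All simplifications are mine."""
    Nc = -(fcd * b * h + fyd * As) / 1e3   # pure compression [kN]
    Nt = fyd * As / 1e3                    # pure tension [kN]
    M0 = fyd * As * 0.8 * h / 1e6          # rough bending capacity, lever ~0.8h [kNm]
    N = np.array([Nc, 0.0, Nt])
    M = np.array([0.0, M0, 0.0])
    return N, M
```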


You can use some code here to set the parameters for plotting the graphs, like using the same axis scales for the Y- and Z-directions of the sections, or collecting the 1D elements with the governing section forces for annotation in the graphs.

And finally you can plot the section forces from the FEM model against the section capacity, to be used to check if your model is working right, if the (element) design is feasible, or even to produce an automated detailed design on these elements:
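Putting it together, the final plot overlays the FEM section forces on the capacity curve; all numbers below are synthetic and the capacity polygon is just an example shape:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import matplotlib.pyplot as plt

# Synthetic capacity polygon (M, N pairs) and some FEM force results
M_cap = [0.0, 174.0, 0.0]
N_cap = [-3435.0, 0.0, 435.0]
My = [15.0, -8.0, 42.0]
N = [-120.0, -80.0, 35.0]

fig, ax = plt.subplots()
ax.plot(M_cap, N_cap, "k-", label="section capacity")
ax.scatter(My, N, s=10, c="tab:red", label="FEM section forces")
ax.set_xlabel("My [kNm]")
ax.set_ylabel("N [kN]")
ax.legend()
```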


That's it! I still have to change the "elem_nr" manually to generate the capacity check for the next section. Generating capacity checks for all sections included in the FEM model is definitely next on the ToDo list!

I hope you enjoyed this little exercise in data crunching and automation with FEM output. Let me know what you think in the comments!