Monthly Archives: March 2022

Handling far future dates in pandas

Recently I got a request to include specific data quality metadata with the CSV datasets that my client delivers to customers. It was very simple: just counts, min, max and, in the case of integers, sums per attribute. Not a difficult …

Posted in Python

My GitHub repo got 50 stars

I never imagined myself as a maintainer of a data engineering-related open source thing. Yet. But when I was working on our data engineering course, I needed some kind of data lake software. At first I used the Cloudera …

Posted in Apache Products for Outsiders, Data engineering, Learning Big Data

Five years of data engineering

Five years ago I made the switch from Oracle database administration to data engineering. It has been quite a ride. I made a video about this to celebrate.

Posted in Active Learning, Data engineering, Learning Big Data