Category Archives: Data engineering

What a year 2021 has been

So at the end of 2021 I found myself in the waiting room of an emergency dentist. An infection above my front teeth had become unbearable. Fortunately, antibiotics make my life much better now. Let that event not colour my view … Continue reading

Posted in Active Learning, Data engineering

What I think data engineering is (revisited)

Four years now I’ve been working as a data engineer. And when I started writing about how to enter this field (because people sometimes ask me), I found out it’s better to start writing about what data engineering actually is. … Continue reading

Posted in Data engineering

Tech dossier: pandas

I’m keeping tech dossiers in Evernote on open source products I want to keep track of. And I decided to put them on my blog. My previous ones were on Kubernetes and Elasticsearch. This one is on the Python data … Continue reading

Posted in Data engineering, Python, Tech dossier

Book review: Spark in Action, 2nd edition

There are lots of books on Spark, but not many that are aimed at the data engineer. Data engineers use Spark to ingest and transform data, which is different from what data scientists use it for. On the Roaring Elephant … Continue reading

Posted in Data engineering, Spark | 2 Comments