Category Archives: Data engineering

What I think data engineering is (revisited)

Four years now I’ve been working as a data engineer. And when I started writing about how to enter this field (because people sometimes ask me), I found out it’s beter to start writing about what data engineering actually is. … Continue reading

Posted in Data engineering | Tagged , | Leave a comment

Tech dossier: pandas

I’m keeping tech dossiers in Evernote on open source products I want to keep track of.  And I decided to put them on my blog. My previous ones were on Kubernetes and Elasticsearch. This one is on the Python data … Continue reading

Posted in Data engineering, Python, Tech dossier | Tagged , , , , , | Leave a comment

Book review: Spark in Action, 2nd edition

There are lots of books on Spark, but not a lot that aimed at the data engineer. Data engineers use Spark to ingest and transform data, which is different from what data scientists use it for. On the Roaring Elephant … Continue reading

Posted in Data engineering, Spark | Tagged , , , | 2 Comments