Monthly Archives: September 2019

Tech dossier: pandas

I’m keeping tech dossiers in Evernote on open source products I want to keep track of, and I decided to put them on my blog. My previous ones were on Kubernetes and Elasticsearch. This one is on the Python data … Continue reading

Posted in Data engineering, Python, Tech dossier

The Atlas REST API – working examples

Originally I was writing a blog post about my experiences with Apache Atlas (which is still in the works), in which I would refer to a Hortonworks Community post I wrote with all the working examples of Atlas REST API calls. … Continue reading

Posted in Apache Atlas | 2 Comments

Book review: Spark in Action, 2nd edition

There are lots of books on Spark, but not many that are aimed at the data engineer. Data engineers use Spark to ingest and transform data, which is different from what data scientists use it for. On the Roaring Elephant … Continue reading

Posted in Data engineering, Spark | 2 Comments