Tag Archives: Docker

I built a working Hadoop-Spark-Hive cluster on Docker. Here is how.

TL;DR: I made a Docker Compose setup that runs Hadoop, Spark and Hive in a multi-container environment. You can find the necessary files for it here: https://github.com/Marcel-Jan/docker-hadoop-spark How it started: We at DIKW are working on a Certified Data Engineering … Continue reading

Posted in Howto, Learning Big Data, Spark | 6 Comments

Dataworks Summit Berlin 2018, day two

Back for round two of keynotes, good technical sessions, and discussions with fellow data specialists in between. Keynotes: First up was Frank Säuberlich from Teradata, who had an interesting example of machine learning for fraud detection at Danske Bank. … Continue reading

Posted in Conferences, Events | Leave a comment