Tag Archives: Spark
Book review: Spark in Action, 2nd edition
There are lots of books on Spark, but not many that are aimed at the data engineer. Data engineers use Spark to ingest and transform data, which is different from what data scientists use it for. On the Roaring Elephant … Continue reading
Posted in Data engineering, Spark
Tagged Apache Spark, Jean-Georges Perrin, Roaring Elephant podcast, Spark
Dataworks Summit Berlin 2018, day two
Back for round two of keynotes, good technical sessions and discussing them with fellow data specialists in between. Keynotes: First up was Frank Säuberlich from Teradata, who had an interesting example of machine learning for fraud detection at Danske Bank. … Continue reading
Posted in Conferences, Events
Tagged Apache Atlas, Apache Metron, Apache Ranger, Data Steward Studio, Dataworks Summit, Docker, GDPR, Personal data, Roaring Elephant podcast, Spark, Synerscope, TPC-H