Category Archives: Howto
I built a working Hadoop-Spark-Hive cluster on Docker. Here is how.
TL;DR: I made a Docker Compose setup that runs Hadoop, Spark and Hive in a multi-container environment. You can find the necessary files for it here: https://github.com/Marcel-Jan/docker-hadoop-spark How it started: We at DIKW are working on a Certified Data Engineering … Continue reading
Posted in Howto, Learning Big Data, Spark
Tagged Apache Spark, Big Data Europe, DIKW, Docker, docker-compose, Hadoop, Hive
6 Comments
A humidity sensor network on a Raspberry Pi with Zigbee2MQTT
I was looking for a way to detect leaks in my apartment with some kind of IoT solution. Someone on the Dutch technology forum Tweakers.net told me that Xiaomi humidity sensors, combined with Zigbee2MQTT, might be a good fit. The … Continue reading
Posted in Howto
Tagged Domoticz, humidity sensors, Internet of Things, IoT, Raspberry Pi, Xiaomi Aqara, Zigbee2MQTT
Neo4J: Loading rocket data in a graph database
When I first learned about graph databases, like Neo4J, I didn’t get it. That’s how I always start with new technology: not getting at all why people are so enthusiastic about it. Then I read “Seven Databases in Seven Weeks, … Continue reading
Posted in Active Learning, Howto, NoSQL
Tagged Cypher, graph database, Jonathan's Space Page, Neo4J, Seven Databases in Seven Weeks
Showing a complex Excel sheet who’s boss with Python and pandas
Data engineering isn’t always creating serverless APIs and ingesting terabyte-a-minute streams with doohickeys on Kubernetes. Sometimes people just want their Excel sheet in the data lake. Is that big data? Not even close. It’s very small. But for … Continue reading
Posted in Howto, Python
Tagged Excel, header, multiindex, pandas, Python, space fueling stations, stack, unstack
5 Comments
Building HDP 2.6 on AWS, Part 3: the worker nodes
This is part 3 in a series on how to build a Hortonworks Data Platform 2.6 cluster on AWS. By now we have an edge node to run Ambari Server and three master nodes for Hadoop name nodes and such. Now … Continue reading
Posted in Howto, Learning Big Data
Tagged Amazon Web Services, AWS, cloning nodes, Hadoop, HDP, Hortonworks Data Platform, Ubuntu Server, worker nodes
Fun with Data: Python and space rocks!
Last week I had a little fun playing with Python, the pandas and matplotlib libraries, and a JSON file with asteroid data. Here is what I did.
Posted in Howto, Python
Tagged asteroids, matplotlib, pandas, Python, Science, sentdex
Building HDP 2.6 on AWS, Part 2: the master nodes
This is part 2 in a series on how to build a Hortonworks Data Platform 2.6 cluster on AWS. In part 1 we created an edge node where we will later install Ambari Server. The next step is creating the … Continue reading
Posted in Howto, Learning Big Data
Tagged Amazon Web Services, AWS, cloning nodes, Hadoop, HDP, Hortonworks Data Platform, master node, Ubuntu Server
5 Comments
Building HDP 2.6 on AWS, Part 1: the edge node
Installing Hortonworks Data Platform 2.6 on Amazon Web Services (Amazon’s cloud platform), how hard could it be? It’s click, click, next, next, confirm, right? Well-lll, not quite. Especially if HDP or AWS is new to you. There are many steps … Continue reading
Posted in Howto, Learning Big Data
Tagged Amazon Web Services, AWS, edge node, Hadoop, HDP, Hortonworks Data Platform, Ubuntu Server
2 Comments