Author Archives: Marcel-Jan Krijgsman
If there is one thing I learned when becoming a data engineer, it’s that having just Hadoop expertise is probably not enough. For starters: what it means to be a data engineer is not exactly sharply defined. Some say data … Continue reading
TL;DR I tried Lion’s Mane from Four Sigmatic, which is branded as a cognitive enhancer. I’ve used it while studying Deep Neural Networks, amongst other things. I’ve done alternate weeks with and without Lion’s Mane and in my experience the … Continue reading
If you’ve worked with the Hortonworks Data Platform 2.x sandbox or later versions in VirtualBox and shut it down rather vigorously, you might have noticed that you won’t get past this startup screen when you try to start it up … Continue reading
This is a tutorial on how to import fixed-length data into Apache Hive (in Hortonworks Data Platform 2.6.1). The idea is that people who are not Hive- or Hadoop-savvy can follow along, so let me know if I succeeded (make … Continue reading
Last week I had a little fun playing with Python, the pandas and matplotlib libraries, and a JSON file with asteroid data. Here is what I did.
If you don’t know a lot about YARN and why it’s called a data operating system, you’re in luck. I found it necessary to explain how YARN works before I could explain the solutions for high availability. At first YARN … Continue reading
I’ve spent a couple of hours studying how Hadoop high availability works for the HDPCA exam. And now I’ve condensed that knowledge into a video on HDFS HA in just under 9 minutes. Enjoy!
Let’s talk about certification. The thing by which you try to show potential employers and customers that you actually know what you are doing at work. My only experience up to last Tuesday with IT product-related certifications was with Oracle’s … Continue reading