Author Archives: Marcel-Jan Krijgsman

About Marcel-Jan Krijgsman

In 2017 I made the leap to Big Data after 20 years of experience with Oracle databases. I followed courses on Hadoop, Big Data Analytics, Machine Learning and Python, MongoDB and Elasticsearch.

I feel great when I study

When I started studying Hadoop, Python and machine learning in 2016, I found something out that I didn’t expect. I feel better when I study. When I finished another problem, exam or course, and I stepped outside the house to … Continue reading

Posted in Learning Big Data | Tagged , | Leave a comment

Playing with asteroids data in MongoDB

If there is one thing I learned when becoming a data engineer, it’s that having just Hadoop expertise is probably not enough. For starters: what it means to be a data engineer is not exactly sharply defined. Some say data … Continue reading

Posted in NoSQL | Tagged , , , , | Leave a comment

Hadoop in a Hurry – Security

When talking about Hadoop security there are so many products and features. What do all of them do? This video gives a high over overview.

Posted in Apache Products for Outsiders, DBA2Hadoop | Tagged , , , , , , , , , | Leave a comment

I tried Lion’s Mane as a cognitive enhancer. Here are my experiences with it.

TL;DR I tried Lion’s Mane from Four Sigmatic, which is branded as a cognitive enhancer. I’ve used it while studying Deep Neural Networks, amongst other things. I’ve done alternate weeks with and without Lion’s Mane and in my experience the … Continue reading

Posted in Weird experiments | Tagged , , , | Leave a comment

Recovering your HDP 2.6.1 Sandbox on VirtualBox after a restart

If you’ve worked with the Hortonworks Data Platform 2.x sandbox of later versions in VirtualBox and made it shutdown rather vigorously, you might have noticed that you won’t get past this startup screen when you try to start it up … Continue reading

Posted in Apache Products for Outsiders, Howto, Learning Big Data | Tagged , , , , , , , , , | 2 Comments

Tutorial: Let’s throw some asteroids in Apache Hive

This is a tutorial on how to import data (with fixed lenght) in Apache Hive (in Hortonworks Data Platform 2.6.1). The idea is that any non-Hive, non-Hadoop savvy people can follow along, so let me know if I succeeded (make … Continue reading

Posted in Apache Products for Outsiders, Learning Big Data | Tagged , , , , , , , , | Leave a comment

Fun with Data: Python and space rocks!

Last week I had a little fun with playing with Python, the pandas and matplotlib library and a JSON file with asteroid data. Here is what I did.

Posted in Howto, Python | Tagged , , , , , | Leave a comment

Hadoop High Availability In A Hurry – Part 2: YARN

If you don’t know a lot about YARN and why it’s called a data operating system, you’re in luck. I found it necessary to explain how YARN works before I could explain the solutions for high availability. At first YARN … Continue reading

Posted in Apache Products for Outsiders, Learning Big Data | Tagged , , , , , , | 1 Comment

Hadoop High Availability In A Hurry – Part 1: HDFS

I’ve been studying for a couple of hours how Hadoop high availability works, for the HDPCA exam. And now I’ve condensed that knowledge to a video on HDFS HA in just under 9 minutes. Enjoy!

Posted in Apache Products for Outsiders, Learning Big Data | Tagged , , , , , , , , , , , | 1 Comment

Certifying as HDP Certified Administrator

Let’s talk about certification. The thing by which you try to show potential employers and customers that you actually know what you are doing at work. My only experience up to last Tuesday with IT product-related certifications was with Oracle’s … Continue reading

Posted in Learning Big Data | Tagged , , , | Leave a comment