Building HDP 2.6 on AWS, Part 3: the worker nodes

This is part 3 in a series on how to build a Hortonworks Data Platform 2.6 cluster on AWS. By now we have an edge node to run Ambari Server and three master nodes for the Hadoop name nodes and other master services. Now we need worker nodes for processing the data.

Creating the worker nodes is not that much different from creating the master nodes. The workers do need more powerful machines, though.

Creating the first worker node

Log in at Amazon Web Services again, in the same AWS region as the edge and master nodes. We start with one worker node and clone two more from it later on. Go to the EC2 dashboard in the AWS interface and click “Launch instance”. Then choose Ubuntu Server 16.04 from the Amazon Machine Images.

For the workers we need machines with a little more oomph. Select a general purpose instance of type m4.2xlarge.
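If you prefer the command line over the console wizard, the same launch can be scripted with the AWS CLI. The AMI ID, key pair, subnet and security group below are placeholders for your own values, not the ones used in this series.

```shell
# Launch one m4.2xlarge worker from the Ubuntu Server 16.04 AMI.
# All IDs below are placeholders -- substitute your own AMI ID,
# key pair, subnet and security group.
aws ec2 run-instances \
  --image-id ami-xxxxxxxx \
  --instance-type m4.2xlarge \
  --key-name my-keypair \
  --subnet-id subnet-xxxxxxxx \
  --security-group-ids sg-xxxxxxxx \
  --count 1
```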

Networking

The worker nodes will be in the same subnet as the masters.

Storage

This is possibly more storage than necessary: 100 GB for the root volume, 100 GB of type EBS for /dev/sdb and 25 GB of type EBS for /dev/sdc.
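For reference, the same storage layout can be expressed as a block device mapping when launching from the AWS CLI instead of the console wizard. The AMI ID is a placeholder; volume sizes are in GB.

```shell
# Same storage layout as in the console wizard, expressed as a
# block device mapping. Replace ami-xxxxxxxx with your own AMI ID.
aws ec2 run-instances \
  --image-id ami-xxxxxxxx \
  --instance-type m4.2xlarge \
  --block-device-mappings '[
    {"DeviceName": "/dev/sda1", "Ebs": {"VolumeSize": 100}},
    {"DeviceName": "/dev/sdb",  "Ebs": {"VolumeSize": 100}},
    {"DeviceName": "/dev/sdc",  "Ebs": {"VolumeSize": 25}}
  ]'
```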

Tags

The tags, of course, are different from those of the master nodes.

Review

Extra preparation to install Ambari and HDP

You need unzip:

sudo apt-get update

sudo apt-get install unzip

Elastic IP

Create a new Elastic IP and associate the worker node with it.
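This too can be done from the AWS CLI. The instance ID below is a placeholder for that of your worker node.

```shell
# Allocate a new Elastic IP in the VPC.
aws ec2 allocate-address --domain vpc

# Note the AllocationId in the output, then associate it
# with the worker node (instance ID is a placeholder).
aws ec2 associate-address \
  --instance-id i-xxxxxxxxxxxxxxxxx \
  --allocation-id eipalloc-xxxxxxxx
```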


Clone the worker node

Now we clone the first worker node to two new workers. This works the same way as cloning the master nodes, except this time we choose m4.2xlarge instances.

Change the tags in the instance list.

The new nodes have the same software as the first worker node, but passwordless access is something you have to configure on each of them. You also need to put the worker nodes in the OCS-POC edge-sg security group and associate them, one by one, with the Elastic IP, so you can log in directly.
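Configuring passwordless access comes down to copying the edge node's public key to each new worker, as in the earlier parts of this series. A minimal sketch, assuming the key lives at ~/.ssh/id_rsa.pub and using placeholder hostnames for the new workers:

```shell
# Run on the edge node. Copy its public key to each new worker
# so it can log in without a password. The hostnames are
# placeholders for your own worker addresses.
for host in worker2 worker3; do
  ssh-copy-id -i ~/.ssh/id_rsa.pub ubuntu@"$host"
done
```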

Test the connection

When the worker nodes are started, you should have access as root from the edge node.
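A quick way to verify this from the edge node, using placeholder hostnames for the workers:

```shell
# From the edge node: each worker should print its hostname
# without prompting for a password. Hostnames are placeholders.
for host in worker1 worker2 worker3; do
  ssh root@"$host" hostname
done
```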


About Marcel-Jan Krijgsman

In 2017 I made the leap to Big Data after 20 years of experience with Oracle databases. I followed courses on Hadoop, Big Data Analytics, Machine Learning and Python, MongoDB and Elasticsearch.
This entry was posted in Howto, Learning Big Data. Bookmark the permalink.
