I think Docker simplifies big-data DevOps concerns by a factor of ten or more.
It is easy enough to run a single command and bring to life any specific distribution of Hadoop in Docker containers.
To give a flavor of it, I am writing this blog entry. In part 1 of this series I set up a Linux-container-based environment; in this entry I describe a Docker-based setup.
Step 1: Install Docker
Step 2: Install Kubernetes with Kubectl
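Before moving on, it is worth confirming that both tools are actually on the PATH. A minimal sketch (the tool names `docker` and `kubectl` are the standard binaries; everything else here is plain shell):

```shell
# Check that docker and kubectl are installed before continuing.
for tool in docker kubectl; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: NOT FOUND - install it before continuing"
  fi
done
```

If either line reports NOT FOUND, revisit the corresponding step above before attempting to launch anything.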
In my case I did not want to clutter my laptop, so I use a CentOS 6.6 VM on my MacBook Pro. Starting the VM is one extra step, but it keeps my host laptop free of installations and configuration.
Once both step #1 and step #2 are working for you, here is how to launch a Hadoop instance.
Step 3: Create a pod definition for Kubernetes. Pick any available Hadoop image from Docker Hub.
[dockeruser@centos6 docker-for-hadoop]$ vi hbase-single-node-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: hbase-single-node-pod
  labels:
    name: hbase-single-node-pod
spec:
  containers:
    - name: hbase
      image: 'santanu77/hadoop-docker'
      ports:
        - containerPort: 60000
          hostPort: 60000
        - containerPort: 60010
          hostPort: 60010
        - containerPort: 8088
          hostPort: 8088
[dockeruser@centos6 docker-for-hadoop]$ kubectl create -f hbase-single-node-pod.yaml
pods/hbase-single-node-pod
[dockeruser@centos6 docker-for-hadoop]$ kubectl describe pod hbase-single-node-pod
Name:                           hbase-single-node-pod
Namespace:                      default
Image(s):                       santanu77/hadoop-docker
Node:                           127.0.0.1/127.0.0.1
Labels:                         name=hbase-single-node-pod
Status:                         Running
Reason:
Message:
IP:                             172.17.0.1
Replication Controllers:        <none>
Containers:
  hbase:
    Image:              santanu77/hadoop-docker
    State:              Running
      Started:          Thu, 10 Sep 2015 23:55:16 -0400
    Ready:              True
    Restart Count:      0
Conditions:
  Type          Status
  Ready         True
No events.
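The pod IP shown by describe can also be scripted out rather than read by eye. A sketch that parses the describe output with awk; the here-string below just replays the transcript fields from above, while in practice you would pipe `kubectl describe pod hbase-single-node-pod` into the same awk:

```shell
# Extract the pod IP from `kubectl describe pod` output.
# Live usage (assumes a running cluster):
#   kubectl describe pod hbase-single-node-pod | awk '/^IP:/ {print $2}'
describe_output='Name: hbase-single-node-pod
Status: Running
IP: 172.17.0.1'
echo "$describe_output" | awk '/^IP:/ {print $2}'   # -> 172.17.0.1
```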
I can also hit the Hadoop cluster manager service from my host, since port 8088 was mapped to the host's ports. So I can access it using my VM's static IP on port 8088.
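Because port 8088 is mapped through to the host, the cluster manager UI URL is just the VM's address plus that port. A quick sketch; `192.168.56.101` is a hypothetical placeholder for your VM's static IP:

```shell
# Hypothetical VM IP - substitute your VM's static address.
VM_IP=192.168.56.101
RM_URL="http://${VM_IP}:8088"
echo "$RM_URL"
# Once the cluster is up, a 200 here confirms the UI is reachable:
# curl -s -o /dev/null -w '%{http_code}\n' "$RM_URL"
```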