apache - Setting up a Hadoop Cluster on Amazon Web Services with EBS
I am wondering how to set up a Hadoop cluster (say 5 nodes) through AWS. I know how to create a cluster on EC2, but I don't know how to handle the following challenges.
- What happens if I lose a spot instance? How do I keep the cluster going?
- I am working with datasets of size 1 TB. Is it possible to set up EBS accordingly? How can I access HDFS in this scenario?
Any help would be great!
Depending on your requirements, these suggestions will change. However, assuming a 2 master and 3 worker setup, you can use R3 instances for the master nodes, since they are optimized for memory-intensive applications, and go with D2 instances for the worker nodes. D2 instances have multiple local disks and can withstand disk failures while still keeping the data safe.
To answer your specific questions:
- Treat the Hadoop machines like any other Linux application hosts. What would happen if a general CentOS spot instance were lost? The same applies here, so it is advisable to use reserved instances.
- Hadoop typically stores data by maintaining 3 copies and distributing them across the worker nodes in the form of 128 MB or 256 MB blocks. So, for a 1 TB dataset, you will have 3 TB of data to store across the 3 worker nodes. Obviously, you have to consider overhead while calculating the space requirements.
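To make the space calculation above concrete, here is a rough sketch in Python. The function name `hdfs_storage_estimate` is made up for illustration, and the 25% overhead figure is purely an assumption on my part (headroom for temporary and intermediate data), not something Hadoop prescribes. It also assumes 1 TB = 1024 * 1024 MB. Adjust the numbers to your own workload.

```python
import math


def hdfs_storage_estimate(dataset_tb, replication=3, block_mb=128,
                          workers=3, overhead=0.25):
    """Rough raw-disk estimate for storing a dataset in HDFS.

    overhead is an assumed fraction of extra headroom (0.25 = 25%);
    it is an illustrative placeholder, not a Hadoop default.
    """
    # Every block is stored `replication` times across the workers.
    raw_tb = dataset_tb * replication
    # Add the assumed headroom on top of the replicated data.
    provisioned_tb = raw_tb * (1 + overhead)
    # Spread evenly across the worker nodes.
    per_worker_tb = provisioned_tb / workers
    # Number of blocks before replication (1 TB taken as 1024 * 1024 MB).
    blocks = math.ceil(dataset_tb * 1024 * 1024 / block_mb)
    return {
        "raw_tb": raw_tb,
        "provisioned_tb": provisioned_tb,
        "per_worker_tb": per_worker_tb,
        "blocks": blocks,
        "block_replicas": blocks * replication,
    }


if __name__ == "__main__":
    # The scenario from the question: 1 TB of data, 3 copies, 3 workers.
    for key, value in hdfs_storage_estimate(1.0).items():
        print(f"{key}: {value}")
```

With the defaults, this mirrors the numbers in the answer: 1 TB of input becomes 3 TB of replicated data, and the overhead factor then decides how much disk each worker actually needs.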