azure hadoop cloudera cloudera-manager cloudera-director

Explanation of Cloudera architecture on cloud (Azure)


I am new to the Hadoop/Cloudera world and need to set up a Cloudera cluster on Microsoft Azure. If I understand correctly, there are two ways to install Cloudera on a cluster: using Cloudera Manager or through a manual installation. According to this schema, it seems a dedicated machine is needed for Cloudera Manager, plus 3 master nodes.


But in this table it seems I can install Cloudera Manager directly on the Master Node.


So here are my doubts/questions:

Thanks in advance for any information.


Solution

  • The Cloudera documentation at https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cm_ig_host_allocations.html shows that the number of master nodes varies with your cluster size and high availability requirements.

    Similarly, in the first two cases listed there, the utility host that runs Cloudera Manager also carries all Utility and Edge roles. As the cluster size grows, more utility hosts are added, and Cloudera Manager becomes the only utility running on its host.

    https://www.cloudera.com/products/product-components/cloudera-director.html describes Cloudera Director, a tool for running Hadoop clusters in a public cloud (AWS/Azure/Google Cloud). Cloudera Director works with Cloudera Manager to provide centralised administration of cloud clusters. https://www.cloudera.com/documentation/director/2-2-x/topics/director_cdh_cluster_management.html is also a useful reference on the differences between Cloudera Director and Cloudera Manager.
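
To make the host-allocation point concrete, here is a minimal Python sketch of the two utility-host layouts described above: Cloudera Manager sharing a host with the other Utility and Edge roles, versus Cloudera Manager alone on its own host. The `Host` class, the `plan_utility_hosts` function, the host names, and the role names are all hypothetical placeholders for illustration; they are not taken from the Cloudera documentation or any Cloudera API, and the right layout for a given cluster size should be read from the linked host-allocation page.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    """A machine in the cluster and the roles assigned to it (illustrative only)."""
    name: str
    roles: List[str] = field(default_factory=list)


# Placeholder role names, assumed for this sketch.
CM = "Cloudera Manager"
OTHER_UTILITY_ROLES = ["other utility roles"]
EDGE_ROLES = ["edge/gateway roles"]


def plan_utility_hosts(dedicated_cm: bool) -> List[Host]:
    """Return a utility-host layout.

    dedicated_cm=False: one host carries Cloudera Manager together with
    the other utility and edge roles (the smaller-cluster case).
    dedicated_cm=True: Cloudera Manager is the only role on its host and
    the remaining roles move to separate hosts (the larger-cluster case).
    """
    if not dedicated_cm:
        return [Host("utility-1", [CM] + OTHER_UTILITY_ROLES + EDGE_ROLES)]
    return [
        Host("utility-1", [CM]),
        Host("utility-2", list(OTHER_UTILITY_ROLES)),
        Host("edge-1", list(EDGE_ROLES)),
    ]


if __name__ == "__main__":
    for dedicated in (False, True):
        print(f"dedicated_cm={dedicated}")
        for host in plan_utility_hosts(dedicated):
            print(f"  {host.name}: {', '.join(host.roles)}")
```

In both layouts Cloudera Manager sits on a utility host rather than on a master node; what changes with cluster size is only whether it shares that host with other roles.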