Ubuntu server 12.04通过Ambari搭建hadoop集群

Ambari是Apache Software Foundation 中的一个项目,方便用户创建、监控、管理Hadoop集群。

以往很多对集群的人工修改的集群配置,通过Ambari可以集中式修改。因此,Ambari号称大数据平台的利器确实当之无愧。

本文总结了Ambari的安装使用方法。

安装

参考官方文档Install Ambari 2.1.2 from Public Repositories.

从官方网站下载repo

  cd /etc/apt/sources.list.d
  wget <ambari-repo-url>

不同的操作系统<ambari-repo-url>不同,对于Ubuntu12.04,

  wget http://public-repo-1.hortonworks.com/ambari/ubuntu12/2.x/updates/2.1.2/ambari.list

安装配置ambari server

从repo安装ambari server

  apt-key adv --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD
  apt-get update
  apt-get install ambari-server

配置ambari server

ambari-server setup

安装翻过来倒过去做了一晚上。注意尽量使用默认的配置,例如使用默认的postgresql,而不要使用mysql。使用默认的配置成功率高一些。

以下是我安装的log,我选择了root用户,jdk1.8,内置的postgreSQL。

# apt-get install ambari-server
Reading package lists... Done
Building dependency tree       
Reading state information... Done
ambari-server is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 114 not upgraded.
root@slave3:~# ambari-server setup
Using python  /usr/bin/python2.7
Setup ambari-server
Checking SELinux...
WARNING: Could not run /usr/sbin/sestatus: OK
Customize user account for ambari-server daemon [y/n] (n)? y
Enter user account for ambari-server daemon (root):
Adjusting ambari-server permissions and ownership...
Checking firewall status...
Checking JDK...
[1] Oracle JDK 1.8 + Java Cryptography Extension (JCE) Policy Files 8
[2] Oracle JDK 1.7 + Java Cryptography Extension (JCE) Policy Files 7
[3] Custom JDK
==============================================================================
Enter choice (1): 1
To download the Oracle JDK and the Java Cryptography Extension (JCE) Policy Files you must accept the license terms found at http://www.oracle.com/technetwork/java/javase/terms/license/index.html and not accepting will cancel the Ambari Server setup and you must install the JDK and JCE files manually.
Do you accept the Oracle Binary Code License Agreement [y/n] (y)? y
Downloading JDK from http://public-repo-1.hortonworks.com/ARTIFACTS/jdk-8u40-linux-x64.tar.gz to /var/lib/ambari-server/resources/jdk-8u40-linux-x64.tar.gz
jdk-8u40-linux-x64.tar.gz... 100% (165.2 MB of 165.2 MB)
Successfully downloaded JDK distribution to /var/lib/ambari-server/resources/jdk-8u40-linux-x64.tar.gz
Installing JDK to /usr/jdk64/
Successfully installed JDK to /usr/jdk64/
Downloading JCE Policy archive from http://public-repo-1.hortonworks.com/ARTIFACTS/jce_policy-8.zip to /var/lib/ambari-server/resources/jce_policy-8.zip

Successfully downloaded JCE Policy archive to /var/lib/ambari-server/resources/jce_policy-8.zip
Installing JCE policy...
Completing setup...
Configuring database...
Enter advanced database configuration [y/n] (n)? n
Configuring database...
Default properties detected. Using built-in database.
Configuring ambari database...
Checking PostgreSQL...
About to start PostgreSQL
Configuring local database...
Connecting to local database...done.
Configuring PostgreSQL...
Extracting system views...
ambari-admin-2.1.2.377.jar
......
Adjusting ambari-server permissions and ownership...
Ambari Server 'setup' completed successfully.

启动ambari-server

# ambari-server start
Using python  /usr/bin/python2.7
Starting ambari-server
Ambari Server running with administrator privileges.
About to start PostgreSQL
Organizing resource files at /var/lib/ambari-server/resources...
Server PID at: /var/run/ambari-server/ambari-server.pid
Server out at: /var/log/ambari-server/ambari-server.out
Server log at: /var/log/ambari-server/ambari-server.log
Waiting for server start....................
Ambari Server 'start' completed successfully.

使用ambari快速搭建hadoop集群

本文采用两个机器搭建的hadoop集群。ip分别为192.168.1.126和192.168.1.38。126的机器上作为ambari-server

ambari-server是安装在8080端口的,通过安装ambari-serverip:port打开。默认用户名:admin,密码:admin。如图:

login

选择Launch Install Wizard:

launch

  1. Get Started. 填入集群名称:

    get started

  2. Select Stack. 这里选择最新的HDP2.3。

    select stack

  3. Install Options。这步卡了好久。两点注意:target hosts必须填入FQDN,即形如xxx.xxx.com的域名形式,而不能是简单的hostname。第二点是要保证安装server的机器能够无密码ssh到集群中的其他节点,注意这里最好是可以ssh到其他节点的root账户。

    SSH设置网上有好多教程,不再赘述。这里说说FQDN的设置。注意必须修改集群中每台主机的hosts文件。

     $ sudo vim /etc/hosts
    

    注释掉127.0.1.1行,并在末尾添加两台机器的ip与域名对应关系。第二列即为FQDN。修改后的文件内容如下所示:

     127.0.0.1       localhost
    
     # The following lines are desirable for IPv6 capable hosts
     ::1     ip6-localhost ip6-loopback
     fe00::0 ip6-localnet
     ff00::0 ip6-mcastprefix
     ff02::1 ip6-allnodes
     ff02::2 ip6-allrouters
    
     192.168.1.126 slave3.test.com slave3
     192.168.1.38  master.test.com master
    

    可以通过hostname -f来验证FQDN。

     $ hostname -f
     slave3.test.com
    

    Host Registration Information中填入用于ssh到集群内其他节点root账号的私钥文件。

  4. Confirm Hosts. 等待安装ambari-agent。安装完后会有warning。需要在每台主机上安装ntp协议。这时一种用于在多台主机之间进行时间同步的协议。在每台主机上使用以下命令即可。

     $ sudo apt-get install ntp
    

    Confirm Hosts

    安装完成后点击Rerun Checks按钮,这时warning消失。

    Confirm Hosts

  5. Choose Services. 选择所需要的服务。这个根据自己的需要了。组件之间有依赖关系,如果没有选择被依赖的组件,ambari会加以提示。本文的选择安装了Hive/HBase/HDFS/Yarn/Spark/Flume/Pig等组件。

  6. Assign Master. 选择master安装的机器。这个可以选择默认,也可以根据需要自行调整。本文采用默认设置。

    Assign Master

  7. Assign Slaves and Clients. 本文采用默认配置。

    Assign Slaves

  8. Customize Services. 这一步需要配置hive密码。

    Customize Services

  9. Review. 直接跳过。

  10. 然后进入漫长的安装阶段。要等好久!可以去吃个午饭。

    Install, Start and Test

写于2015年10月30日