Preface
I'm about to take over some big-data work. Before doing that I need a basic understanding of the big-data ecosystem and of how the common components are configured and used, so I'll be writing a few articles on the subject. This one is about Hadoop.
Hadoop introduction
Hadoop is an open-source parallel processing framework from the Apache Foundation.
HBase: a NoSQL database, similar to MongoDB.
HDFS: Hadoop Distributed File System, Hadoop's distributed file system.
Zookeeper: a distributed coordination framework; here the Zookeeper cluster is used to keep the Hadoop cluster highly available.
How high availability works
Zookeeper keeps the NameNode service highly available as follows: the Hadoop cluster runs two NameNode services, and both send periodic heartbeats to Zookeeper to report that they are alive and able to serve. At any point in time only one of them is in the Active state; the other is Standby. Once Zookeeper stops receiving heartbeats from the Active NameNode, it fails over to the Standby NameNode and promotes it to Active, which is how NameNode high availability is achieved.
The Zookeeper cluster also keeps itself highly available. Its nodes take one of two roles: Leader or Follower.
Writes go to the Leader first; the Leader then tells the Followers to apply the write.
Reads can be served from any node, since every node holds the same data.
When the Leader fails, an election starts: any node that notices the cluster has no Leader nominates itself and asks the other nodes to vote; once more than half of the machines agree, the election ends and that node becomes the new Leader.
This is why a Zookeeper cluster should have an odd number of nodes: a majority can still be formed after a failure, a new Leader is elected quickly, and the cluster stays available.
The three ZooKeeper roles: Leader, Follower, and Observer.
HDFS HA principles
A single NameNode is a single point of failure: if the NameNode becomes unavailable, the whole HDFS cluster becomes unavailable, so we need a highly available HDFS design to remove that single point of failure. The solution is to run multiple NameNodes in the HDFS cluster, but introducing multiple NameNodes raises a few questions:
- How do we keep the NameNodes' in-memory metadata consistent and keep the edit log safe?
- How do the NameNodes coordinate with each other?
- How does a client find the NameNode that is currently serving?
- How do we guarantee that only one NameNode serves clients at any given time?
With Zookeeper in the picture, the answers are:
- Metadata consistency and edit-log safety are handled by writing the edit log to a shared location (the JournalNodes), with Zookeeper coordinating failover.
- One NameNode is Active and the other is Standby; only one can be Active at any point in time.
Software compatibility table
The big-data ecosystem is fairly complex; commonly used components such as HBase and Hive have requirements on the software they run against, so I've put together a table for quick reference, to avoid the various problems caused by incompatible versions.
Hadoop and the required component versions are as follows:

| Software | Version | Notes |
|---|---|---|
| hadoop | 3.3.0 | stable |
| zookeeper | 3.6.3 | stable |
| hive | 3.1.2 | stable |
| hbase | 2.3.5 | stable |
Environment preparation
Three virtual machines are prepared for this experiment:

| Hostname | OS | IP |
|---|---|---|
| hadoop-node1 | CentOS7 | 192.168.122.40 |
| hadoop-node2 | CentOS7 | 192.168.122.41 |
| hadoop-node3 | CentOS7 | 192.168.122.42 |

The DolphinScheduler version is as follows:

| Software | Version | Notes |
|---|---|---|
| apache-dolphinscheduler | 1.3.6 | latest-bin |
Role assignment:

| Role | hadoop-node1 | hadoop-node2 | hadoop-node3 |
|---|---|---|---|
| HDFS master | NameNode | NameNode | |
| HDFS worker | DataNode | DataNode | DataNode |
| Yarn master | ResourceManager | ResourceManager | |
| Yarn worker | NodeManager | NodeManager | NodeManager |
| HBase master | HMaster | HMaster | |
| HBase worker | HRegionServer | HRegionServer | HRegionServer |
| Zookeeper process | QuorumPeerMain | QuorumPeerMain | QuorumPeerMain |
| NameNode edit-log sync | JournalNode | JournalNode | JournalNode |
| Active/standby failover | DFSZKFailoverController | DFSZKFailoverController | |
Node configuration
Disable IPv6
Append the following to /etc/sysctl.conf:
```
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1
```
The GRUB kernel command line also needs to be changed:
```
GRUB_CMDLINE_LINUX="crashkernel=auto ... ipv6.disable=1"
```
Regenerate the GRUB configuration:
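A minimal sketch for CentOS 7, assuming a BIOS machine with the default /boot/grub2/grub.cfg path (UEFI systems use a path under /boot/efi instead):
```shell
# Rebuild grub.cfg so the ipv6.disable=1 argument takes effect after reboot
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
```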
Disable password authentication
For security, disable password authentication on every node and use key-based authentication instead.
Before turning it off we need a key pair locally; if you already have one, simply put the public key into ~/.ssh/authorized_keys of the root user on each node.
If you don't have a key yet, generate one first:
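For example (an RSA key at the default ~/.ssh/id_rsa path; the 4096-bit size is just a suggestion):
```shell
ssh-keygen -t rsa -b 4096
```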
Then run:
```shell
ssh-copy-id root@hadoop-node1.nil.ml
ssh-copy-id root@hadoop-node2.nil.ml
ssh-copy-id root@hadoop-node3.nil.ml
```
Disable SSH password authentication by editing /etc/ssh/sshd_config and changing the following settings:
```
PasswordAuthentication no
ChallengeResponseAuthentication no
```
Save, exit, and restart the sshd service.
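The restart itself is just standard systemd usage:
```shell
sudo systemctl restart sshd
```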
Cluster planning
Node-to-role mapping:

| VM name | Role |
|---|---|
| ntp | ntp server |
| dns | dns server |
| kerberos-server | kerberos server |
| hadoop-node1.nil.ml | NameNode (master), DataNode, JournalNode, NodeManager, Zookeeper, DFSZKFailoverController |
| hadoop-node2.nil.ml | NameNode (standby), DFSZKFailoverController, ResourceManager (master), DataNode, JournalNode, NodeManager, Zookeeper |
| hadoop-node3.nil.ml | ResourceManager (standby), DataNode, JournalNode, NodeManager, Zookeeper |

Users used in this setup:

| Username | Purpose |
|---|---|
| zookeeper | regular user (runs ZK) |
| dolphinscheduler | DolphinScheduler admin user |
| hdfs | HDFS runtime user |
| hive | Hive runtime user |

Groups used:

| Group | Members | Notes |
|---|---|---|
| hadoop | hdfs, hive, dolphinscheduler | big-data users |
Port description
Ports and the services behind them:

| Component | Service | Port | Config | Notes |
|---|---|---|---|---|
| | | | | |
Virtual machine network
The virtual machines live on the 192.168.122.0/24 network.
Configure each VM's IP and hostname, disable SELinux and the firewall, then reboot:
```shell
hostnamectl set-hostname hadoop-node1.nil.ml
nmcli con mod eth0 ipv4.method manual ipv4.addresses 192.168.122.40/24 ipv4.gateway 192.168.122.1 ipv4.dns "192.168.122.1, 114.114.114.114" connection.autoconnect yes
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
systemctl disable firewalld
reboot
```
```shell
hostnamectl set-hostname hadoop-node2.nil.ml
nmcli con mod eth0 ipv4.method manual ipv4.addresses 192.168.122.41/24 ipv4.gateway 192.168.122.1 ipv4.dns "192.168.122.1, 114.114.114.114" connection.autoconnect yes
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
systemctl disable firewalld
reboot
```
```shell
hostnamectl set-hostname hadoop-node3.nil.ml
nmcli con mod eth0 ipv4.method manual ipv4.addresses 192.168.122.42/24 ipv4.gateway 192.168.122.1 ipv4.dns "192.168.122.1, 114.114.114.114" connection.autoconnect yes
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
systemctl disable firewalld
reboot
```
Add management users
Create the hadoop group:
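A minimal sketch (assuming the group does not exist yet):
```shell
sudo groupadd hadoop
```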
Sudoer setup: edit the /etc/sudoers file and add:
```
%hadoop ALL=(ALL) NOPASSWD: ALL
```
Create the management user:
```shell
useradd -m -G hadoop user -s /bin/bash
echo "user:changeme" | sudo chpasswd
```
Create the hdfs account:
```shell
sudo useradd -m hdfs -s /bin/bash
echo "hdfs:changeme" | sudo chpasswd
```
Create the hive account:
```shell
sudo useradd -m hive -s /bin/bash
echo "hive:changeme" | sudo chpasswd
```
Create the hbase account:
```shell
sudo useradd -m hbase -s /bin/bash
echo "hbase:changeme" | sudo chpasswd
```
Create the spark account:
```shell
sudo useradd -m spark -s /bin/bash
echo "spark:changeme" | sudo chpasswd
```
Create the dolphinscheduler account:
```shell
sudo useradd -m dolphinscheduler -G hadoop -s /bin/bash
echo "dolphinscheduler:changeme" | sudo chpasswd
```
Create the zookeeper user:
```shell
sudo useradd -m zookeeper -s /bin/bash
echo "zookeeper:changeme" | sudo chpasswd
```
Lock the root user
First copy your key over, so we don't lock ourselves out of the system.
Switch to the user account and create the .ssh directory:
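A minimal sketch of those two steps (the account name follows the users created above):
```shell
# Switch to the management user and prepare its .ssh directory
su - user
mkdir -pv ~/.ssh
```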
Create and edit ~/.ssh/authorized_keys; its content is your public key.
After saving, fix the permissions:
```shell
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
```
Verify that you can connect to each node:
```shell
ssh user@hadoop-node1.nil.ml
ssh user@hadoop-node2.nil.ml
ssh user@hadoop-node3.nil.ml
```
Once everything works, lock the root account:
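A minimal sketch, assuming you only want to disable root's password (run on each node):
```shell
# Lock the root password; root SSH login can additionally be disabled in sshd_config
sudo passwd -l root
```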
Create the upload directory
The directory and its permissions:

| Directory | Permissions | Notes |
|---|---|---|
| /opt/packages | hadoop group: rwx | where all packages are stored |

```shell
sudo mkdir -pv /opt/packages
sudo chown -R user /opt/packages
```
NTP configuration
Time synchronization between nodes is a crucial part of a cluster, so let's set up a time server.
First set the time zone:
```shell
sudo timedatectl set-timezone Asia/Shanghai
```
Configure the NTP server on hadoop-node1.nil.ml. First install chrony:
```shell
sudo yum install chrony -y
```
Edit /etc/chrony.conf and change the following:
```
server ntp.aliyun.com iburst
allow 192.168.122.0/24
```
Enable and start the chronyd service:
```shell
sudo systemctl enable chronyd.service --now
```
Check the service status:
```shell
sudo systemctl status chronyd.service
```
Install chrony on the other nodes as well:
```shell
sudo yum install chrony -y
```
Edit /etc/chrony.conf:
```shell
sudo vi /etc/chrony.conf
```
and change it to:
```
server hadoop-node1.nil.ml iburst
```
Restart the chronyd service:
```shell
sudo systemctl restart chronyd.service
```
Check the synchronization status:
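A minimal sketch using chrony's own client:
```shell
# List the configured time sources and whether they are being used
chronyc sources -v
```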
File handle limits:
```shell
cat>/etc/security/limits.d/hadoop.conf<<EOF
* soft nproc 131072
* hard nproc 131072
* soft nofile 131072
* hard nofile 131072
EOF
```
Install the JDK
Install OpenJDK 8:
```shell
sudo yum install -y java-1.8.0-openjdk java-1.8.0-openjdk-devel java-1.8.0-openjdk-headless
```
Configure the shell environment so users can run java directly. Edit /etc/profile; to find the right JAVA_HOME, check where /usr/bin/java points and follow the symlink.
Append at the end:
```shell
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk/
```
Save, exit, then run:
```shell
source /etc/profile && java -version
```
If it prints something like the following, the Java environment is configured correctly:
```
openjdk version "1.8.0_292"
OpenJDK Runtime Environment (build 1.8.0_292-b10)
OpenJDK 64-Bit Server VM (build 25.292-b10, mixed mode)
```
Upload the packages
First make sure rsync is installed on every node:
```shell
sudo yum install rsync -y
```
Upload the local packages to each server:
```shell
rsync -avz . -e ssh user@hadoop-node1.nil.ml:/opt/packages
rsync -avz . -e ssh user@hadoop-node2.nil.ml:/opt/packages
rsync -avz . -e ssh user@hadoop-node3.nil.ml:/opt/packages
```
Install and configure Zookeeper
Install Zookeeper
Extract the binary package:
```shell
cd /opt/packages/
sudo tar -xf apache-zookeeper-3.6.3-bin.tar.gz
sudo mv apache-zookeeper-3.6.3-bin /opt/zookeeper
```
Create the data directory:
```shell
sudo mkdir -pv /opt/zookeeper/data
```
Change ownership:
```shell
sudo chown -R zookeeper:zookeeper /opt/zookeeper
```
Configure Zookeeper
Switch to the zookeeper user:
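A minimal sketch:
```shell
sudo su - zookeeper
```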
Create and edit the configuration file /opt/zookeeper/conf/zoo.cfg:
```shell
vi /opt/zookeeper/conf/zoo.cfg
```
with the following content:
```
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/data
clientPort=2181
server.1=hadoop-node1.nil.ml:2888:3888
server.2=hadoop-node2.nil.ml:2888:3888
server.3=hadoop-node3.nil.ml:2888:3888
```
Configure the server ids:
hadoop-node1.nil.ml
```shell
echo "1" > /opt/zookeeper/data/myid
```
hadoop-node2.nil.ml
```shell
echo "2" > /opt/zookeeper/data/myid
```
hadoop-node3.nil.ml
```shell
echo "3" > /opt/zookeeper/data/myid
```
Start Zookeeper and check its status
To make management easier we create a systemd service for Zookeeper.
Create and edit /lib/systemd/system/zookeeper.service:
```shell
sudo vi /lib/systemd/system/zookeeper.service
```
with the following content:
```ini
[Unit]
Description=Zookeeper Daemon
Documentation=http://zookeeper.apache.org
Requires=network.target
After=network.target
[Service]
Type=forking
WorkingDirectory=/opt/zookeeper
User=zookeeper
Group=zookeeper
ExecStart=/opt/zookeeper/bin/zkServer.sh start /opt/zookeeper/conf/zoo.cfg
ExecStop=/opt/zookeeper/bin/zkServer.sh stop /opt/zookeeper/conf/zoo.cfg
ExecReload=/opt/zookeeper/bin/zkServer.sh restart /opt/zookeeper/conf/zoo.cfg
TimeoutSec=30
Restart=on-failure
[Install]
WantedBy=default.target
```
Start the service:
```shell
sudo systemctl start zookeeper
```
Check its status:
```shell
sudo systemctl status zookeeper
```
Enable it at boot:
```shell
sudo systemctl enable zookeeper
```
Install Hadoop
Passwordless SSH
Passwordless SSH is set up with the hdfs user.
First switch to the hdfs user, then generate a key pair (needed on all three nodes):
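A minimal sketch (empty passphrase, default key path, run as the hdfs user on each node):
```shell
sudo su - hdfs
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
```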
Run this step on hadoop-node1:
```shell
cat .ssh/id_rsa.pub >> .ssh/authorized_keys
```
This creates the .ssh/authorized_keys file.
Next, add the other two public keys: print the public key on each of the other two nodes and append it to this .ssh/authorized_keys file, then copy the resulting .ssh/authorized_keys to the other two nodes.
Create and edit ~/.ssh/authorized_keys there; its content is copied over from hadoop-node1.nil.ml:
```
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC2tDyregbTpxwyPuTNwQy769G8gs+bd3CuRyneo3HomDHRZnx6vE14aLdHs8k1KK7ko3c3eKZ83zrytKbLv9Eq5zH22kmNG2Xp1fiXMGDex81SZ9qrPI2IXW6Dtk82w8nH8XGs+2BcA71RZWzXGBc+CJfUPnEyhKqsZpTFP8FZko/i8ptb9ShghY614etXNzKy9g0O0s9WD9rdBw/QoOC3xybD/aFfbZP+YFgPZSywT8ThXkdJhDucDS6WG9yvASxUAdXyPkjbdrBh0y/FzF3qKbvEunszWo7I27nXndQ8ew3uIp7p+rfjy6bDtgnyvhvSqXaKQ72umzIz/cvlWE17 hdfs@hadoop-node1.nil.ml
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC14y2KT1a4rBBKFLNH3vB1SVEOcV0nrVTFouMFIppADRMVqB2eEoB5p57jWn7vj+RrXPxFDP/qfoj6fO3M6KyHRg+mx0JJiT+LPNhb5M5tacPN79aDQ3Js/hZJHDWjerv42YdkuOkfozhIB8wAti7Pvc/C6/n2MjJly9PjH+mAC5WQl0QbLJmqTnS1yfmCFVhIYhTF9wS0GmTVHdspEcvXHKiNo0QpEmf1ezNPIcO4V5cMhbfdx2mstmx8OWQQdcZ7zBYDOOx5NpdIB5BxSp2yEDQjOfkEr8uc0QH2DhmTVCHwWS2tL+XYdkJPEuGoqQp4EZopeuFfM1gOZZIJsUTl hdfs@hadoop-node2.nil.ml
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDDqq8HXxd4c8/6dc371PkRWVhAyf239bW7w3FU3Qs3ZMcMWbu2UwqF5hlX/Wr55lH4VkvEkMywS4VZBRnr3mHiE+RpPyNdETpb9TAkAp+lHWpbkVVq+reUIpURKaqLZQiq535okktddNLOpZ0bKX4dAR4Iqs8H7OV+EFcSkoZo8BQzI4tuERlAmbJ6D2Ft4pjmE0ii4br0BjSQtAZcEjrc8JGAvus8sQ45UBVAARbk8zL0ekL67Yn7aA2Bws0UYDyX+iVCLrCBVqC0ftI7S39s6nP+50rjQX95ZTvtuPQk23JObN+FnW1acKmZbBkFgBMsoUTM95f3c8kmfpaX94O3 hdfs@hadoop-node3.nil.ml
```
Fix the permissions:
```shell
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
```
Verify that it works:
```shell
ssh hadoop-node2.nil.ml "hostname"
ssh hadoop-node3.nil.ml "hostname"
```
Download Hadoop
Download the package we need:
```shell
cd /opt/packages
wget -c https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.2.2/hadoop-3.2.2.tar.gz
```
Extract it and link it into its own directory:
```shell
sudo tar -xf hadoop-3.2.2.tar.gz
sudo ln -sf /opt/packages/hadoop-3.2.2 /opt/hadoop
```
Change ownership:
```shell
sudo chown -R hdfs:hdfs /opt/hadoop/
```
Append to the end of /etc/profile:
```shell
export HADOOP_HOME=/opt/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
```
Create the HDFS directories (on every node):
```shell
sudo mkdir -pv /opt/hdfs/{namenode,datanode,journalnode}
sudo chown -R hdfs:hadoop /opt/hdfs/
```
Remember to switch to the hdfs user:
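A minimal sketch:
```shell
sudo su - hdfs
```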
Edit the Hadoop configuration
```shell
vi $HADOOP_HOME/etc/hadoop/hadoop-env.sh
```
Add the JAVA_HOME setting:
```shell
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk/
```
Create Hadoop's temporary directory:
```shell
mkdir -pv /opt/hadoop/tmp/
```
On the master node, edit the following three configuration files.
Edit /opt/hadoop/etc/hadoop/core-site.xml:
```shell
vi /opt/hadoop/etc/hadoop/core-site.xml
```
with the following content:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop/tmp</value>
<final>true</final>
<!-- base for other temporary directories -->
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop-cluster/</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<!-- zk ha -->
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop-node1.nil.ml:2181,hadoop-node2.nil.ml:2181,hadoop-node3.nil.ml:2181</value>
</property>
<property>
<name>ha.zookeeper.session-timeout.ms</name>
<value>1000</value>
<description> connection zookeeper timeout ms</description>
</property>
<property>
<name>ipc.client.connect.max.retries</name>
<value>1000</value>
<description>Indicates the number of retries a client will make to establish a server connection.</description>
</property>
<property>
<name>ipc.client.connect.retry.interval</name>
<value>10000</value>
<description>Indicates the number of milliseconds a client will wait for before retrying to establish a server connection.</description>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
</configuration>
```
hdfs-site.xml
```shell
vi /opt/hadoop/etc/hadoop/hdfs-site.xml
```
hadoop-node1 node:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- 存储的副本数量 -->
<property>
<name>dfs.replication</name>
<!-- if 3 datanode this value should be 2 -->
<value>2</value>
</property>
<!-- namenode 数据的存放位置 -->
<property>
<name>dfs.name.dir</name>
<value>/opt/hdfs/namenode</value>
</property>
<!-- datanode 数据的存放位置 -->
<property>
<name>dfs.data.dir</name>
<value>/opt/hdfs/datanode</value>
</property>
<!-- JournalNode在本地磁盘存放数据的位置 -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/opt/hdfs/journalnode</value>
</property>
<!-- secondary 节点信息 -->
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop-node2.nil.ml:9001</value>
</property>
<!-- 集群名称 -->
<property>
<name>dfs.nameservices</name>
<value>hadoop-cluster</value>
</property>
<!-- namenode节点名称 -->
<property>
<name>dfs.ha.namenodes.hadoop-cluster</name>
<value>nn1,nn2</value>
</property>
<!-- nn1 的rpc通信地址 -->
<property>
<name>dfs.namenode.rpc-address.hadoop-cluster.nn1</name>
<value>hadoop-node1.nil.ml:9000</value>
</property>
<!-- nn2 的 RPC 通信地址 -->
<property>
<name>dfs.namenode.rpc-address.hadoop-cluster.nn2</name>
<value>hadoop-node2.nil.ml:9000</value>
</property>
<!-- nn1的http 通信地址 -->
<property>
<name>dfs.namenode.http-address.hadoop-cluster.nn1</name>
<value>hadoop-node1.nil.ml:50070</value>
</property>
<!-- nn2 的 http 通信地址 -->
<property>
<name>dfs.namenode.http-address.hadoop-cluster.nn2</name>
<value>hadoop-node2.nil.ml:50070</value>
</property>
<!-- hdfs web -->
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<!-- 指定 NameNode的edits元数据的共享存储位置也就是JournalNode列表url的配置 -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop-node1.nil.ml:8485;hadoop-node2.nil.ml:8485;hadoop-node3.nil.ml:8485/hadoop-cluster</value>
</property>
<!-- 开启NameNode失败自动切换 -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- 指定失败之后的自动切换实现模式 -->
<property>
<name>dfs.client.failover.proxy.provider.hadoop-cluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- 配置隔离机制方法,多个机制使用换行分割:每个机制占用一行 -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
<!-- 使用 sshfence隔离机制的时候需要配置ssh免密登录 -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hdfs/.ssh/id_rsa</value>
</property>
<!-- 配置sshfence隔离机制超时时间 -->
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
<property>
<name>ha.failover-controller.cli-check.rpc-timeout.ms</name>
<value>60000</value>
</property>
<!-- block access token -->
<property>
<name>dfs.block.access.token.enable</name>
<value>true</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
```
Calculating the NameNode handler count:
Here is a small Python helper:
```python
import math
num = int(input("Enter the cluster node count: "))
th = math.log(num) * 20
print("The dfs namenode handler count is", int(th))
```
More about this parameter: https://blog.csdn.net/qq_43081842/article/details/102672420
fs.defaultFS: the NameNode address.
hadoop.tmp.dir: Hadoop's temporary directory.
dfs.namenode.name.dir: where the FSImage is kept, i.e. the NameNode metadata.
dfs.datanode.data.dir: where HDFS data is kept, i.e. the DataNode blocks.
dfs.replication: the number of HDFS replicas; with two worker nodes the value is 2.
Edit the mapred-site.xml configuration file:
```shell
vi /opt/hadoop/etc/hadoop/mapred-site.xml
```
with the following content:
```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- 指定MR使用框架为yarn -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<!-- 指定 MR Jobhistory的地址 -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop-node1.nil.ml:10020</value>
</property>
<!-- 任务历史服务器的web地址 -->
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop-node1.nil.ml:19888</value>
</property>
</configuration>
```
Edit the YARN configuration file /opt/hadoop/etc/hadoop/yarn-site.xml:
```shell
vi $HADOOP_HOME/etc/hadoop/yarn-site.xml
```
with the following content:
```xml
<?xml version="1.0"?>
<configuration>
<!-- 开启MR高可用 -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- 指定RM的cluster id -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yrc</value>
</property>
<!-- 指定 RM的名字 -->
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<!-- 分别指定RM的地址 -->
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hadoop-node1.nil.ml</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>hadoop-node1.nil.ml:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>hadoop-node2.nil.ml:8088</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>hadoop-node2.nil.ml</value>
</property>
<!-- 指定 zk集群的地址 -->
<property>
<name>hadoop.zk.address</name>
<value>hadoop-node1.nil.ml:2181,hadoop-node2.nil.ml:2181,hadoop-node3.nil.ml:2181</value>
</property>
<!-- 指定 MR 走 shuffle -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- 启用自动恢复 -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!-- 指定 resourcemanager的状态信息存储在 zookeeper集群上 -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<!-- 日志保留时间设置7天 -->
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
<property>
<description>Indicate to clients whether Timeline service is enabled or not.
If enabled, the TimelineClient library used by end-users will post entities
and events to the Timeline server.</description>
<name>yarn.timeline-service.enabled</name>
<value>true</value>
</property>
<property>
<description>The setting that controls whether yarn system metrics is
published on the timeline server or not by RM.</description>
<name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
<value>true</value>
</property>
<property>
<description>Indicate to clients whether to query generic application
data from timeline history-service or not. If not enabled then application
data is queried only from Resource Manager.</description>
<name>yarn.timeline-service.generic-application-history.enabled</name>
<value>true</value>
</property>
</configuration>
```
Configure the worker nodes:
```shell
vi /opt/hadoop/etc/hadoop/workers
```
```
hadoop-node1.nil.ml
hadoop-node2.nil.ml
hadoop-node3.nil.ml
```
Copy the files to the other nodes:
```shell
rsync -avz /opt/hadoop -e ssh hadoop-node2.nil.ml:/opt/
rsync -avz /opt/hadoop -e ssh hadoop-node3.nil.ml:/opt/
```
Startup
First start the JournalNodes (on all nodes):
```shell
hdfs --daemon start journalnode
```
Initialize the NameNode (on hadoop-node1):
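A minimal sketch of formatting the first NameNode (run once, as the hdfs user):
```shell
hdfs namenode -format
```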
Copy the NameNode metadata to hadoop-node2:
```shell
cd /opt/hdfs/namenode
scp -r current hadoop-node2.nil.ml:$PWD
```
Format ZKFC, then start HDFS and YARN (see the sketch below).
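A minimal sketch of these three steps, assuming the environment variables from /etc/profile are loaded:
```shell
# Initialize the HA state znode in Zookeeper (run once, on hadoop-node1)
hdfs zkfc -formatZK
# Start NameNodes, DataNodes, JournalNodes and ZKFCs
start-dfs.sh
# Start ResourceManagers and NodeManagers
start-yarn.sh
```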
Check the YARN state:
```shell
yarn rmadmin -getServiceState rm1
```
Check the HDFS HA state:
```shell
hdfs haadmin -getServiceState nn1
```
If the YARN state looks wrong and a ResourceManager needs to be started by hand, run:
```shell
yarn-daemons.sh start resourcemanager
```
Start the MapReduce job history server:
```shell
mr-jobhistory-daemon.sh start historyserver
```
Web UI addresses:
HDFS:
- hadoop-node1.nil.ml:50070
- hadoop-node2.nil.ml:50070
YARN:
- hadoop-node1.nil.ml:8088
- hadoop-node2.nil.ml:8088
JobHistory:
- hadoop-node1.nil.ml:19888
Hadoop systemd service files
To make management easier, we create a few systemd unit files for the services.
hdfs
Create and edit /lib/systemd/system/hadoop-dfs.service:
```shell
sudo vi /lib/systemd/system/hadoop-dfs.service
```
with the following content:
```ini
[Unit]
Description=Hadoop DFS namenode and datanode
After=syslog.target network.target remote-fs.target nss-lookup.target network-online.target
Requires=network-online.target
[Service]
User=hdfs
Group=hdfs
Type=forking
ExecStart=/opt/hadoop/sbin/start-dfs.sh
ExecStop=/opt/hadoop/sbin/stop-dfs.sh
WorkingDirectory=/opt/hadoop/
Environment=JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk/
Environment=HADOOP_HOME=/opt/hadoop/
TimeoutStartSec=2min
Restart=on-failure
PIDFile=/tmp/hadoop-hadoop-namenode.pid
[Install]
WantedBy=multi-user.target
```
Start it:
```shell
sudo systemctl start hadoop-dfs.service
```
Check its status:
```shell
sudo systemctl status hadoop-dfs.service
```
Enable it at boot:
```shell
sudo systemctl enable hadoop-dfs.service
```
yarn
Create and edit /lib/systemd/system/hadoop-yarn.service:
```shell
sudo vi /lib/systemd/system/hadoop-yarn.service
```
with the following content:
```ini
[Unit]
Description=Hadoop Yarn
After=syslog.target network.target remote-fs.target nss-lookup.target network-online.target
Requires=network-online.target
[Service]
User=hdfs
Group=hdfs
Type=forking
ExecStart=/opt/hadoop/sbin/start-yarn.sh
ExecStop=/opt/hadoop/sbin/stop-yarn.sh
WorkingDirectory=/opt/hadoop/
Environment=JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk/
Environment=HADOOP_HOME=/opt/hadoop/
TimeoutStartSec=2min
Restart=on-failure
PIDFile=/tmp/hadoop-hadoop-namenode.pid
[Install]
WantedBy=multi-user.target
```
Start it:
```shell
systemctl start hadoop-yarn
```
Enable it at boot:
```shell
systemctl enable hadoop-yarn
```
jobhistory
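A minimal sketch of a matching unit, assuming the Hadoop 3 `mapred --daemon` launcher, the unit name hadoop-jobhistory.service, and the same user layout as the units above:
```shell
# Write a hadoop-jobhistory.service unit (sketch; adjust paths/users to taste)
sudo tee /lib/systemd/system/hadoop-jobhistory.service <<'EOF'
[Unit]
Description=Hadoop MapReduce JobHistory server
After=network-online.target
Requires=network-online.target
[Service]
User=hdfs
Group=hdfs
Type=forking
Environment=JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk/
Environment=HADOOP_HOME=/opt/hadoop/
ExecStart=/opt/hadoop/bin/mapred --daemon start historyserver
ExecStop=/opt/hadoop/bin/mapred --daemon stop historyserver
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
```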
HDFS test
Create directories on HDFS:
```shell
hdfs dfs -mkdir /test1
hdfs dfs -mkdir /logs
```
List them:
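A minimal sketch:
```shell
hdfs dfs -ls /
```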
Drop everything under the system's /var/log/ into the /logs/ directory we just created:
```shell
hdfs dfs -put /var/log/* /logs/
```
HA verification
In this section we test whether the HDFS HA deployment actually works.
Kill the active NameNode
First we kill the active NameNode and watch what happens to the cluster.
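A minimal sketch, assuming nn1 is currently active (check first, stop it on its node, then confirm the standby took over):
```shell
# Find out which NameNode is active
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
# On the active node, stop its NameNode process
hdfs --daemon stop namenode
# The other NameNode should now report "active"
hdfs haadmin -getServiceState nn2
```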
Write a systemd service file for each service
hive
Extract the Hive package:
```shell
cd /opt/packages
sudo tar -xf apache-hive-3.1.2-bin.tar.gz
sudo ln -sf /opt/packages/apache-hive-3.1.2-bin /opt/hive
sudo chown -R hive:hive /opt/hive/
```
Append to /etc/profile:
```shell
export HIVE_HOME=/opt/hive
export PATH=$HIVE_HOME/bin:$PATH
export HIVE_CONF_DIR=$HIVE_HOME/conf
export HIVE_AUX_JARS_PATH=$HIVE_HOME/lib
```
Set up the metastore database:
Create and edit /etc/yum.repos.d/mysql-57-ce.repo with the following content:
```
[mysql-57]
name = mysql-57
baseurl = https://mirror.tuna.tsinghua.edu.cn/mysql/yum/mysql57-community-el7/
enable = 1
gpgcheck = 0
```
Refresh the local yum cache:
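A minimal sketch:
```shell
sudo yum makecache
```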
Install MySQL:
```shell
yum install -y mysql-community-client mysql-community-server
```
Configure MySQL
Start the MySQL service, then find the randomly generated temporary password so we can change it:
```shell
systemctl start mysqld
grep "password" /var/log/mysqld.log
2021-03-23T13:33:22.126793Z 1 [Note] A temporary password is generated for root@localhost: iNJhEzzSw2%j
```
Here iNJhEzzSw2%j is the temporary MySQL password; use it to log in:
```shell
mysql -uroot -piNJhEzzSw2%j
```
Set a new root password:
```sql
set global validate_password_policy=0;
set global validate_password_length=6;
ALTER USER 'root'@'localhost' IDENTIFIED BY '123456';
```
Create the hive database and its user:
```sql
CREATE DATABASE hive DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE USER 'hive'@'hadoop-node1.nil.ml' IDENTIFIED BY '123456';
GRANT ALL ON hive.* TO 'hive'@'hadoop-node1.nil.ml';
FLUSH PRIVILEGES;
```
Restart mysqld and enable it at boot:
```shell
systemctl restart mysqld
systemctl enable mysqld
systemctl status mysqld
```
Remember to switch to the hive user.
Configure the hive-env.sh file:
```shell
cp -v /opt/hive/conf/hive-env.sh.template /opt/hive/conf/hive-env.sh
vi /opt/hive/conf/hive-env.sh
```
with the following content:
```shell
export HADOOP_HOME=/opt/hadoop
export HIVE_HOME=/opt/hive
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HIVE_AUX_JARS_PATH=$HIVE_HOME/lib
export HIVE_CONF_DIR=$HIVE_HOME/conf
```
Edit the main configuration file:
```shell
vi /opt/hive/conf/hive-site.xml
```
with the following content:
```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- 数据库配置 -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://hadoop-node1.nil.ml:3306/hive?createDatabaseIfNotExist=true&amp;useUnicode=true&amp;characterEncoding=UTF-8&amp;useSSL=false</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
<!-- hive的工作目录 -->
<property>
<name>hive.exec.local.scratchdir</name>
<value>/opt/hive/tmp/hiveuser</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/opt/hive/tmp/${hive.session.id}_resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
<name>hive.querylog.location</name>
<value>/opt/hive/tmp/qrylog</value>
<description>Location of Hive run time structured log file</description>
</property>
<!-- 元数据 -->
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
<description>Auto creates necessary schema on a startup if one doesn't exist. Set this to false, after creating it once.To enable auto create also set hive.metastore.schema.verification=false. Auto creation is not recommended for production use cases, run schematool command instead.</description>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>
Enforce metastore schema version consistency.
True: Verify that version information stored in is compatible with one from Hive jars. Also disable automatic
schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
proper metastore schema migration. (Default)
False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
</description>
</property>
<!-- zookeeper -->
<property>
<name>hive.zookeeper.quorum</name>
<value>
hadoop-node1.nil.ml:2181,hadoop-node2.nil.ml:2181,hadoop-node3.nil.ml:2181
</value>
<description>
List of ZooKeeper servers to talk to. This is needed for:
1. Read/write locks - when hive.lock.manager is set to
org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager,
1. When HiveServer2 supports service discovery via Zookeeper.
2. For delegation token storage if zookeeper store is used, if
hive.cluster.delegation.token.store.zookeeper.connectString is not set
1. LLAP daemon registry service
2. Leader selection for privilege synchronizer
</description>
</property>
<property>
<name>hive.server2.support.dynamic.service.discovery</name>
<value>true</value>
<description>Whether HiveServer2 supports dynamic service discovery for its clients. To support this, each instance of HiveServer2 currently uses ZooKeeper to register itself, when it is brought up. JDBC/ODBC clients should use the ZooKeeper ensemble: hive.zookeeper.quorum in their connection string.</description>
</property>
<property>
<name>hive.server2.zookeeper.namespace</name>
<value>hiveserver2_zk</value>
<description>The parent node in ZooKeeper used by HiveServer2 when supporting dynamic service discovery.</description>
</property>
<property>
<name>hive.server2.zookeeper.publish.configs</name>
<value>true</value>
<description>Whether we should publish HiveServer2's configs to ZooKeeper.</description>
</property>
<!-- 日志 -->
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/opt/hive/tmp/operation_logs</value>
<description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
<property>
<name>hive.server2.thrift.client.user</name>
<value>root</value>
<description>Username to use against thrift client</description>
</property>
<property>
<name>hive.server2.thrift.client.password</name>
<value>changeme</value>
<description>Password to use against thrift client</description>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
<description>Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'binary'.</description>
</property>
<property>
<name>hive.server2.transport.mode</name>
<value>binary</value>
<description>
Expects one of [binary, http].
Transport mode of HiveServer2.
</description>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>hadoop-node1.nil.ml</value>
<description>Bind host on which to run the HiveServer2 Thrift service.</description>
</property>
<!-- hdfs -->
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/<username> is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
<name>hive.repl.rootdir</name>
<value>/user/hive/repl/</value>
<description>HDFS root dir for all replication dumps.</description>
</property>
<!-- 连接的doas -->
<property>
<name>hive.server2.enable.doAs</name>
<value>FALSE</value>
<description>Setting this property to true will have HiveServer2 execute Hive operations as the user making the calls to it.
</description>
</property>
</configuration>
```
Make sure Hadoop's core-site.xml contains the following:
```xml
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
```
Create the log directories:
```shell
mkdir -pv /opt/hive/tmp/hiveuser /opt/hive/tmp/qrylog /opt/hive/tmp/operation_logs
```
Switch to the hdfs user and create Hive's directories on HDFS:
```shell
hadoop fs -mkdir -p /user/hive/repl /user/hive/warehouse /tmp/hive
hadoop fs -chown -R hive /user/hive/repl /user/hive/warehouse
hadoop fs -chmod -R 777 /tmp /user/hive
```
Install the JDBC driver:
```shell
yum install -y mysql-connector-java
```
```shell
cp -v /opt/hive/conf/hive-exec-log4j2.properties.template /opt/hive/conf/hive-exec-log4j2.properties
cp -v /opt/hive/conf/hive-log4j2.properties.template /opt/hive/conf/hive-log4j2.properties
rm -f /opt/hive/lib/guava-19.0.jar
cp -v /opt/hadoop/share/hadoop/common/lib/guava-27.0-jre.jar /opt/hive/lib/
cp -v /usr/share/java/mysql-connector-java.jar /opt/hive/lib/
```
Initialize the metastore schema:
```shell
$HIVE_HOME/bin/schematool -dbType mysql -initSchema
```
Create and edit /opt/hive/bin/daemon.sh:
```shell
vi /opt/hive/bin/daemon.sh
```
with the following content:
```shell
#!/bin/sh
. /etc/profile
if [ $# -ne 2 ] ;then
echo "please input two params,first is (metastore|hiveserver2),second is (start|stop)"
exit 0
fi
if [ "$1" == "metastore" ] ; then
if [ "$2" == "start" ] ; then
echo "now is start metastore"
nohup $HIVE_HOME/bin/hive --service metastore > /var/log/hive/hive-metastore.log 2>&1 &
exit 0
elif [ "$2" == "stop" ] ; then
ps -ef |grep [h]ive-metastore |awk '{print $2}' | xargs kill
echo "-------metastore has stop"
exit 0
else
echo "second param please input 'start' or 'stop'"
exit 0
fi
elif [ "$1" == "hiveserver2" ] ; then
if [ "$2" == "start" ] ; then
echo "now is start hiveserver2"
nohup $HIVE_HOME/bin/hive --service hiveserver2 > /var/log/hive/hiveserver2.log 2>&1 &
exit 0
elif [ "$2" == "stop" ] ; then
ps -ef |grep [h]iveserver |awk '{print $2}' | xargs kill
else
echo "second param please input 'start' or 'stop'"
exit 0
fi
else
echo "first param please input 'metastore' or 'hiveserver2'"
fi
```
Make it executable:
```shell
chmod +x /opt/hive/bin/daemon.sh
```
Create the service files:
hive-hiveserver2.service
Create and edit the /lib/systemd/system/hive-hiveserver2.service file:
```shell
vi /lib/systemd/system/hive-hiveserver2.service
```
with the following content:
```ini
[Unit]
Description=hiveserver2
Wants=network-online.target
After=network-online.target
[Service]
Type=forking
User=hive
Group=hadoop
Environment="HADOOP_HOME=/opt/hadoop"
ExecStart=/opt/hive/bin/daemon.sh hiveserver2 start
ExecStop=/opt/hive/bin/daemon.sh hiveserver2 stop
Restart=no
[Install]
WantedBy=multi-user.target
```
hive-metastore.service
Create and edit /lib/systemd/system/hive-metastore.service:
```shell
vi /lib/systemd/system/hive-metastore.service
```
with the following content:
```ini
[Unit]
Description=metastore
Wants=network-online.target
After=network-online.target
[Service]
Type=forking
User=hive
Group=hadoop
Environment="HADOOP_HOME=/opt/hadoop"
ExecStart=/opt/hive/bin/daemon.sh metastore start
ExecStop=/opt/hive/bin/daemon.sh metastore stop
Restart=no
[Install]
WantedBy=multi-user.target
```
Reload systemd:
```shell
systemctl daemon-reload
```
Create the Hive log directory:
```shell
mkdir -pv /var/log/hive
chown -R hive:hadoop /var/log/hive/
```
Start the metastore first, then hiveserver2:
```shell
systemctl start hive-metastore.service
systemctl status hive-metastore.service
systemctl start hive-hiveserver2.service
systemctl status hive-hiveserver2.service
```
Run Hive:
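A minimal sketch, assuming you just want the classic Hive CLI as a quick smoke test:
```shell
$HIVE_HOME/bin/hive
```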
Verify with beeline:
```shell
$HIVE_HOME/bin/beeline -u jdbc:hive2://hadoop-node1.nil.ml:10000 -n root --color=true --silent=false
```
Try creating a database:
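A minimal, non-authoritative example (the database name test_db is made up):
```shell
# Run a couple of statements through beeline to confirm HiveServer2 works
$HIVE_HOME/bin/beeline -u jdbc:hive2://hadoop-node1.nil.ml:10000 -n root \
  -e "CREATE DATABASE IF NOT EXISTS test_db; SHOW DATABASES;"
```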
hbase
Extract it:
```shell
cd /opt/packages/
tar -xf hbase-2.3.5-bin.tar.gz
mv hbase-2.3.5 /opt/hbase
sudo chown -R hbase:hbase /opt/hbase
```
Add to /etc/profile:
```shell
export HBASE_HOME=/opt/hbase
export PATH=$HBASE_HOME/bin:$PATH
```
Switch to the hbase user.
Edit /opt/hbase/conf/hbase-env.sh:
```shell
export HBASE_HOME=/opt/hbase
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk/
export HBASE_CLASSPATH=$HBASE_HOME/conf
export HADOOP_HOME=/opt/hadoop
export HBASE_LOG_DIR=$HBASE_HOME/logs
```
Edit /opt/hbase/conf/hbase-site.xml:
```xml
<?xml version="1.0"?>
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop-cluster/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.master</name>
<value>hadoop-node1.nil.ml:60000</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>hadoop-node1.nil.ml,hadoop-node2.nil.ml,hadoop-node3.nil.ml</value>
</property>
</configuration>
```
regionservers
Edit /opt/hbase/conf/regionservers:
```
hadoop-node1.nil.ml
hadoop-node2.nil.ml
hadoop-node3.nil.ml
```
Create the log directory:
```shell
mkdir -pv /opt/hbase/logs
```
Copy HBase to the other nodes:
```shell
scp -r /opt/hbase hadoop-node2.nil.ml:/opt
scp -r /opt/hbase hadoop-node3.nil.ml:/opt
```
Fix ownership:
```shell
chown -R hbase:hbase /opt/hbase
```
Create HBase's directory on HDFS and grant access:
```shell
hadoop fs -mkdir /hbase
hadoop fs -chown hbase /hbase
```
Set up passwordless SSH
Generate a key and create the .ssh directory:
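A minimal sketch (as the hbase user, on each node):
```shell
mkdir -pv ~/.ssh
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
```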
Create and edit ~/.ssh/authorized_keys; its content is your public key.
After saving, fix the permissions:
```shell
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
```
Verify that you can connect to each node:
```shell
ssh hbase@hadoop-node1.nil.ml "whoami"
ssh hbase@hadoop-node2.nil.ml "whoami"
ssh hbase@hadoop-node3.nil.ml "whoami"
```
Start HBase:
```shell
/opt/hbase/bin/start-hbase.sh
```
Verify:
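A minimal sketch using the HBase shell non-interactively:
```shell
# "status" should report the masters and three region servers
echo "status" | /opt/hbase/bin/hbase shell
```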
Spark
Drawbacks of Hadoop MapReduce:
1. Limited expressiveness (everything is Map and Reduce).
2. Heavy disk I/O (shuffle).
3. High latency.
Spark:
1. Spark's computation model belongs to the MapReduce family; it keeps the strengths of Hadoop MapReduce while addressing its problems.
2. It is not limited to Map and Reduce: it provides many more dataset operations, so the programming model is more flexible than Hadoop MapReduce.
3. Spark computes in memory and can keep intermediate results there, which makes iterative workloads much more efficient.
4. Spark's DAG-based task scheduling is superior to Hadoop MapReduce's iterative execution model.
Iterative computation on Hadoop is very resource hungry; Spark loads the data into memory once, and subsequent iterations work directly on in-memory intermediate results, avoiding repeated reads from disk.
Spark consists of a Master, Slaves (workers), a Driver, and an Executor process on each worker that runs the tasks.
The Master is the Cluster Manager; Standalone, Yarn and Mesos are supported.
Build environment:
- Maven 3.6.3
- Scala 2.12
- Java 8
- shasum
```shell
mkdir -pv tools
cd tools
wget -c https://ftp.wayne.edu/apache/maven/maven-3/3.8.1/binaries/apache-maven-3.8.1-bin.tar.gz
```
```shell
wget -c https://mirrors.tuna.tsinghua.edu.cn/apache/spark/spark-3.1.2/spark-3.1.2.tgz
```
Build a runnable distribution without the Hive jars:
```shell
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=1g"
export PATH=$PATH
./dev/make-distribution.sh --name "hadoop3-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-3.2"
```
Passwordless SSH for the spark user
Switch to the spark user and generate a key pair (needed on all three nodes):
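The same sketch as for the hdfs user, just as spark:
```shell
sudo su - spark
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
```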
Run this step on hadoop-node1:
```shell
cat .ssh/id_rsa.pub >> .ssh/authorized_keys
```
This creates the .ssh/authorized_keys file.
Next, add the other two public keys: print the public key on each of the other two nodes and append it to this .ssh/authorized_keys file, then copy the resulting .ssh/authorized_keys to the other two nodes.
Create and edit ~/.ssh/authorized_keys there; its content is copied over from hadoop-node1.nil.ml:
```
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC2tDyregbTpxwyPuTNwQy769G8gs+bd3CuRyneo3HomDHRZnx6vE14aLdHs8k1KK7ko3c3eKZ83zrytKbLv9Eq5zH22kmNG2Xp1fiXMGDex81SZ9qrPI2IXW6Dtk82w8nH8XGs+2BcA71RZWzXGBc+CJfUPnEyhKqsZpTFP8FZko/i8ptb9ShghY614etXNzKy9g0O0s9WD9rdBw/QoOC3xybD/aFfbZP+YFgPZSywT8ThXkdJhDucDS6WG9yvASxUAdXyPkjbdrBh0y/FzF3qKbvEunszWo7I27nXndQ8ew3uIp7p+rfjy6bDtgnyvhvSqXaKQ72umzIz/cvlWE17 spark@hadoop-node1.nil.ml
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC14y2KT1a4rBBKFLNH3vB1SVEOcV0nrVTFouMFIppADRMVqB2eEoB5p57jWn7vj+RrXPxFDP/qfoj6fO3M6KyHRg+mx0JJiT+LPNhb5M5tacPN79aDQ3Js/hZJHDWjerv42YdkuOkfozhIB8wAti7Pvc/C6/n2MjJly9PjH+mAC5WQl0QbLJmqTnS1yfmCFVhIYhTF9wS0GmTVHdspEcvXHKiNo0QpEmf1ezNPIcO4V5cMhbfdx2mstmx8OWQQdcZ7zBYDOOx5NpdIB5BxSp2yEDQjOfkEr8uc0QH2DhmTVCHwWS2tL+XYdkJPEuGoqQp4EZopeuFfM1gOZZIJsUTl spark@hadoop-node2.nil.ml
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDDqq8HXxd4c8/6dc371PkRWVhAyf239bW7w3FU3Qs3ZMcMWbu2UwqF5hlX/Wr55lH4VkvEkMywS4VZBRnr3mHiE+RpPyNdETpb9TAkAp+lHWpbkVVq+reUIpURKaqLZQiq535okktddNLOpZ0bKX4dAR4Iqs8H7OV+EFcSkoZo8BQzI4tuERlAmbJ6D2Ft4pjmE0ii4br0BjSQtAZcEjrc8JGAvus8sQ45UBVAARbk8zL0ekL67Yn7aA2Bws0UYDyX+iVCLrCBVqC0ftI7S39s6nP+50rjQX95ZTvtuPQk23JObN+FnW1acKmZbBkFgBMsoUTM95f3c8kmfpaX94O3 spark@hadoop-node3.nil.ml
```
Fix the permissions:
```shell
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
```
Verify that it works:
```shell
ssh hadoop-node2.nil.ml "hostname"
ssh hadoop-node3.nil.ml "hostname"
```
Deploy Spark
```shell
cd /opt/packages
tar -xf spark-3.1.2-bin-hadoop3.2.tgz
ln -sf /opt/packages/spark-3.1.2-bin-hadoop3.2 /opt/spark
chown -R spark:spark /opt/spark/
```
Add the following to /etc/profile:
```shell
export SPARK_HOME=/opt/spark
export PATH=$PATH:$SPARK_HOME/bin
```
Edit the spark-env.sh file:
```shell
cp -v /opt/spark/conf/spark-env.sh.template /opt/spark/conf/spark-env.sh
vi /opt/spark/conf/spark-env.sh
```
and add:
```shell
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk/
export SPARK_MASTER_HOST=hadoop-node1.nil.ml
```
Edit the workers file:
```shell
cp -v /opt/spark/conf/workers.template /opt/spark/conf/workers
vi /opt/spark/conf/workers
```
with the following content:
```
hadoop-node1.nil.ml
hadoop-node2.nil.ml
hadoop-node3.nil.ml
```
Sync the configuration to the other nodes:
```shell
rsync -avz /opt/spark/ -e ssh hadoop-node2.nil.ml:/opt/spark/
rsync -avz /opt/spark/ -e ssh hadoop-node3.nil.ml:/opt/spark/
```
Start the master:
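A minimal sketch, assuming the standalone scripts that ship with the Spark 3 binary distribution:
```shell
/opt/spark/sbin/start-master.sh
```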
Master UI: hadoop-node1.nil.ml:8081
Start the workers:
```shell
/opt/spark/sbin/start-workers.sh
```
Test it. Install python3 (on every worker node):
```shell
yum install python3-devel python3 python3-wheel -y
```
Then submit the example Pi job:
```shell
spark-submit --master spark://hadoop-node1.nil.ml:7077 /opt/spark/examples/src/main/python/pi.py
```
Configure Hive on Spark
Add the following to hive-site.xml:
```xml
<!-- hive on spark -->
<property>
<name>hive.execution.engine</name>
<value>spark</value>
</property>
```
Copy the Spark jars into Hive's lib directory:
```shell
cp -v /opt/spark/jars/scala-library-2.12.10.jar /opt/hive/lib
cp -v /opt/spark/jars/spark-core_2.12-3.1.2.jar /opt/hive/lib
cp -v /opt/spark/jars/spark-network-common_2.12-3.1.2.jar /opt/hive/lib
```
DolphinScheduler
Install and configure DolphinScheduler
```shell
tar -xf apache-dolphinscheduler-1.3.6-bin.tar.gz
```
```shell
sudo mv apache-dolphinscheduler-1.3.6-bin /opt/dolphinscheduler-bin
```
Create the user:
```shell
sudo useradd -m -g sudo dolphinscheduler -s /bin/bash
echo "dolphinscheduler:changeme" | sudo chpasswd
```
Fix ownership:
```shell
sudo chown -R dolphinscheduler /opt/dolphinscheduler-bin
```
Switch to the user:
```shell
sudo su - dolphinscheduler
```
Set up passwordless SSH:
```shell
su dolphinscheduler;
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```
Relax the default database password policy:
```sql
show VARIABLES like "%password%";
```
The output shows the validate_password.policy variable set to MEDIUM; change it to whatever strength you need. Here we drop it to the lowest level (for testing only):
```sql
SET PERSIST validate_password.policy = 0;
```
Initialize the database:
```sql
CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE USER 'dolphinscheduler'@'localhost' IDENTIFIED BY 'changeme';
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dolphinscheduler'@'localhost';
flush privileges;
```
Create the tables and import the seed data
First download the MySQL connector:
```shell
wget -c https://cdn.mysql.com//Downloads/Connector-J/mysql-connector-java-8.0.26-1.el7.noarch.rpm
```
Install it:
```shell
sudo yum install -y mysql-connector-java-8.0.26-1.el7.noarch.rpm
```
Copy the driver into DolphinScheduler's lib directory:
```shell
cp -v /usr/share/java/mysql-connector-java-8.0.24.jar /opt/dolphinscheduler-bin/lib/
```
Edit datasource.properties in the conf directory:
```shell
vi conf/datasource.properties
```
Comment out the PostgreSQL section and uncomment the MySQL section, so it reads:
```properties
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.url=jdbc:mysql://localhost:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8&allowMultiQueries=true
spring.datasource.username=dolphinscheduler
spring.datasource.password=changeme
```
Run the database script from the script directory:
```shell
sh script/create-dolphinscheduler.sh
```
Adjust the runtime environment:
```shell
vi conf/env/dolphinscheduler_env.sh
```
with the following content:
```shell
export HADOOP_HOME=/opt/hadoop
export HADOOP_CONF_DIR=/opt/hadoop/etc/hadoop
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk/
export PATH=$HADOOP_HOME/bin:$JAVA_HOME/bin:$PATH
```
Edit the one-click deployment configuration:
```shell
vi conf/config/install_config.conf
```
with the following content:
```properties
# mysql or postgresql
dbtype="mysql"
# 数据库连接地址
dbhost="localhost:3306"
# 数据库名
dbname="dolphinscheduler"
# 数据库用户名,此处需要修改为上面设置的{user}具体值
username="dolphinscheduler"
# 数据库密码, 如果有特殊字符,请使用\转义,需要修改为上面设置的{password}具体值
password="changeme"
#Zookeeper地址,单机本机是localhost:2181,记得把2181端口带上
zkQuorum="hadoop-node1.nil.ml:2181,hadoop-node2.nil.ml:2181,hadoop-node3.nil.ml:2181"
#将DS安装到哪个目录,如: /opt/soft/dolphinscheduler,不同于现在的目录
installPath="/opt/dolphinscheduler"
#使用哪个用户部署,使用第3节创建的用户
deployUser="dolphinscheduler"
# 邮件配置,以qq邮箱为例
# 邮件协议
mailProtocol="SMTP"
# 邮件服务地址
mailServerHost="smtp.qq.com"
# 邮件服务端口
mailServerPort="25"
# mailSender和mailUser配置成一样即可
# 发送者
mailSender="xxx@qq.com"
# 发送用户
mailUser="xxx@qq.com"
# 邮箱密码
mailPassword="xxx"
# TLS协议的邮箱设置为true,否则设置为false
starttlsEnable="true"
# 开启SSL协议的邮箱配置为true,否则为false。注意: starttlsEnable和sslEnable不能同时为true
sslEnable="false"
# 邮件服务地址值,参考上面 mailServerHost
sslTrust="smtp.qq.com"
# 业务用到的比如sql等资源文件上传到哪里,可以设置:HDFS,S3,NONE,单机如果想使用本地文件系统,请配置为HDFS,因为HDFS支持本地文件系统;如果不需要资源上传功能请选择NONE。强调一点:使用本地文件系统不需要部署hadoop
resourceStorageType="HDFS"
# 这里以保存到本地文件系统为例
#注:但是如果你想上传到HDFS的话,NameNode启用了HA,则需要将hadoop的配置文件core-site.xml和hdfs-site.xml放到conf目录下,本例即是放到/opt/dolphinscheduler/conf下面,并配置namenode cluster名称;如果NameNode不是HA,则修改为具体的ip或者主机名即可
defaultFS="file:///hadoop-cluster/dolphinscheduler" #hdfs://{具体的ip/主机名}:8020
# 如果没有使用到Yarn,保持以下默认值即可;如果ResourceManager是HA,则配置为ResourceManager节点的主备ip或者hostname,比如"192.168.xx.xx,192.168.xx.xx";如果是单ResourceManager请配置yarnHaIps=""即可
# 注:依赖于yarn执行的任务,为了保证执行结果判断成功,需要确保yarn信息配置正确。
yarnHaIps="hadoop-node1.nil.ml,hadoop-node2.nil.ml,hadoop-node3.nil.ml"
# 如果ResourceManager是HA或者没有使用到Yarn保持默认值即可;如果是单ResourceManager,请配置真实的ResourceManager主机名或者ip
singleYarnIp="yrc"
# 资源上传根路径,支持HDFS和S3,由于hdfs支持本地文件系统,需要确保本地文件夹存在且有读写权限
resourceUploadPath="/data/dolphinscheduler/"
# 具备权限创建resourceUploadPath的用户
hdfsRootUser="hdfs"
#在哪些机器上部署DS服务,本机选localhost
ips="localhost"
#ssh端口,默认22
sshPort="22"
#master服务部署在哪台机器上
masters="localhost"
#worker服务部署在哪台机器上,并指定此worker属于哪一个worker组,下面示例的default即为组名
workers="localhost:default"
#报警服务部署在哪台机器上
alertServer="localhost"
#后端api服务部署在在哪台机器上
apiServers="localhost"
```
Create the resource directory:
```shell
sudo mkdir -pv /data/dolphinscheduler
sudo chown -R dolphinscheduler:dolphinscheduler /data/dolphinscheduler
```
Before deploying, repoint the default /bin/sh link to bash:
```shell
sudo rm /bin/sh
sudo ln -s /usr/bin/bash /bin/sh
```
One-click deployment:
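A minimal sketch, assuming the install script that ships inside the 1.3.6 binary package:
```shell
cd /opt/dolphinscheduler-bin
sh install.sh
```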
Login URL: hadoop-node1.nil.ml:12345/dolphinscheduler
Default user: admin
Default password: dolphinscheduler123
Monitoring
Prometheus
LogicMonitor
Dynatrace
Log processing
Postscript
References
[How to Install and Configure Hadoop on Ubuntu 20.04](https://tecadmin.net/install-hadoop-on-ubuntu-20-04/)
[Configuring Ports](https://ambari.apache.org/1.2.3/installing-hadoop-using-ambari/content/reference_chap2.html)
https://developpaper.com/building-hadoop-high-availability-cluster-based-on-zookeeper/
https://tutorials.freshersnwo.com/hadoop-tutorial/hadoop-high-availability/
https://www.xenonstack.com/insights/apache-zookeeper-security/
https://zookeeper.apache.org/doc/r3.6.0/zookeeperAdmin.html#sc_authOptions
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/SecureContainer.html
https://docs.confluent.io/platform/current/security/zk-security.html
https://docs.cloudera.com/runtime/7.2.0/zookeeper-security/topics/zookeeper-configure-client-shell-kerberos-authentication.html
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SecureMode.html
https://zhuanlan.zhihu.com/p/99398378