I. Environment and Preparation

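Cluster environment (three nodes, as seen in the command output below): node01 172.16.150.154, node02 172.16.150.155, node03 172.16.150.156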

 

Software versions (as installed below): JDK 1.8.0_162, ZooKeeper 3.4.13, Kafka 2.0.1 (Scala 2.11 build)

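Pre-deployment steps:

Disable the firewall and SELinux (in production, enable or disable them as your policy requires)
Synchronize the server clocks, using either a public time server or one you host yourself
[root@es1 ~]# crontab -l # a public time server is used here for convenience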
#update time
*/5 * * * *  /usr/bin/rdate -s time-b.nist.gov &>/dev/null

II. ZooKeeper Cluster Installation and Configuration

1. Install the JVM dependencies (all three nodes)

Install the JDK

[root@node01 ~]# rpm -ivh jdk1.8.0_162-x64.rpm  # install 1.8 directly to avoid upgrade hassle later
Preparing...                ########################################### [100%]
   1:jdk1.8.0_162           ########################################### [100%]

Set up the Java environment

[root@node01 ~]# cat /etc/profile.d/java.sh  # Java environment configuration file
export JAVA_HOME=/usr/java/latest
export CLASSPATH=$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
[root@node01 ~]# . /etc/profile.d/java.sh  
[root@node01 ~]# java -version   # verify the configuration
java version "1.8.0_162"
Java(TM) SE Runtime Environment (build 1.8.0_162-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, mixed mode)
2. Install and configure ZooKeeper
[root@node01 ~]# wget http://mirror.bit.edu.cn/apache/zookeeper/zookeeper-3.4.13/zookeeper-3.4.13.tar.gz
[root@node01 ~]# tar xf zookeeper-3.4.13.tar.gz -C /usr/local
[root@node01 ~]# cd /usr/local
[root@node01 local]# ln -sv zookeeper-3.4.13 zookeeper
[root@node01 local]# cd zookeeper/conf
[root@node01 conf]# cp  zoo_sample.cfg  zoo.cfg
[root@node01 conf]# vim zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/Data/zookeeper
clientPort=2181
server.1=172.16.150.154:2888:3888
server.2=172.16.150.155:2888:3888
server.3=172.16.150.156:2888:3888
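# Configuration parameters:
tickTime: the basic time unit, in milliseconds. Clients and servers, and servers among themselves, exchange a heartbeat every tickTime. Heartbeats are used both to monitor liveness and to time Follower-Leader communication. The value here is 2000 ms (2 seconds).
initLimit: the maximum number of ticks (multiples of tickTime) a follower (F) may take for its initial connection to the leader (L).
syncLimit: the maximum number of ticks allowed between a request and its response between a follower (F) and the leader (L).
dataDir: the directory that holds the myid file (the server's unique ID), snapshots and, unless dataLogDir is set, the transaction logs.
clientPort: the port ZooKeeper listens on for client connections and requests. The conventional value is 2181.
server.N=YYY:A:B
N: the server number (the value stored in myid)
YYY: the server address
A: the port followers use to communicate with the leader, i.e. the internal quorum port (conventionally 2888)
B: the leader election port (conventionally 3888)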
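
As a worked example of the parameters above: initLimit and syncLimit are counted in ticks, so the effective timeouts are tickTime x initLimit and tickTime x syncLimit. The sketch below computes them from a zoo.cfg-style file. The helper name zk_timeouts is hypothetical and not part of ZooKeeper.

```shell
# zk_timeouts CFG: print the effective init and sync timeouts (in ms)
# derived from tickTime, initLimit and syncLimit in a zoo.cfg-style file.
# Hypothetical helper for illustration; not shipped with ZooKeeper.
zk_timeouts() {
    awk -F= '
        $1 == "tickTime"  { tick = $2 }
        $1 == "initLimit" { ilimit = $2 }
        $1 == "syncLimit" { slimit = $2 }
        END {
            printf "init_timeout_ms=%d\n", tick * ilimit
            printf "sync_timeout_ms=%d\n", tick * slimit
        }
    ' "$1"
}
```

With the values used in this article (tickTime=2000, initLimit=10, syncLimit=5), `zk_timeouts /usr/local/zookeeper/conf/zoo.cfg` reports 20000 ms for the initial connection and 10000 ms for request/response sync.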

Create the directory and myid file that ZooKeeper needs

[root@node01 conf]# mkdir -pv /Data/zookeeper 
mkdir: created directory "/Data"
mkdir: created directory "/Data/zookeeper"
[root@node01 conf]# echo "1" > /Data/zookeeper/myid # the myid file contains a number that identifies this host; ZooKeeper will not start without it

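All other nodes are configured identically, except for:

echo "x" > /Data/zookeeper/myid # x must be unique per node and match N in that node's server.N line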
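
Because myid has to agree with the server.N entry for the same host, one way to avoid mismatches is to derive it from zoo.cfg instead of typing it by hand. The following is a minimal sketch under that assumption. The function name myid_for_ip is hypothetical, and it expects the server.N=IP:port:port layout shown above.

```shell
# myid_for_ip CFG IP: print N from the "server.N=IP:A:B" line whose
# address equals IP. Exits non-zero if no line matches.
# Hypothetical helper for illustration; not shipped with ZooKeeper.
myid_for_ip() {
    awk -F= -v ip="$2" '
        $1 ~ /^server\.[0-9]+$/ {
            split($2, addr, ":")          # addr[1] is the host part
            if (addr[1] == ip) {
                sub(/^server\./, "", $1)  # keep only the number N
                print $1
                found = 1
                exit
            }
        }
        END { exit !found }
    ' "$1"
}
```

On node02, for example, it could be used as `myid_for_ip /usr/local/zookeeper/conf/zoo.cfg 172.16.150.155 > /Data/zookeeper/myid`.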
3. Start ZooKeeper (all three nodes)
[root@node01 zookeeper]# cd /usr/local/zookeeper/bin
[root@node01 bin]# ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@node01 bin]# tailf zookeeper.out 
2019-02-13 14:05:28,088 [myid:] - INFO  [main:QuorumPeerConfig@136] - Reading configuration from: /usr/local/zookeeper/bin/../conf/zoo.cfg
2019-02-13 14:05:28,102 [myid:] - INFO  [main:QuorumPeer$QuorumServer@184] - Resolved hostname: 172.16.150.154 to address: /172.16.150.154
2019-02-13 14:05:28,102 [myid:] - INFO  [main:QuorumPeer$QuorumServer@184] - Resolved hostname: 172.16.150.156 to address: /172.16.150.156
2019-02-13 14:05:28,103 [myid:] - INFO  [main:QuorumPeer$QuorumServer@184] - Resolved hostname: 172.16.150.155 to address: /172.16.150.155
2019-02-13 14:05:28,103 [myid:] - INFO  [main:QuorumPeerConfig@398] - Defaulting to majority quorums
2019-02-13 14:05:28,108 [myid:1] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2019-02-13 14:05:28,108 [myid:1] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
2019-02-13 14:05:28,108 [myid:1] - INFO  [main:DatadirCleanupManager@101] - Purge task is not scheduled.
2019-02-13 14:05:28,119 [myid:1] - INFO  [main:QuorumPeerMain@130] - Starting quorum peer
2019-02-13 14:05:28,128 [myid:1] - INFO  [main:ServerCnxnFactory@117] - Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory
2019-02-13 14:05:28,134 [myid:1] - INFO  [main:NIOServerCnxnFactory@89] - binding to port 0.0.0.0/0.0.0.0:2181
2019-02-13 14:05:28,144 [myid:1] - INFO  [main:QuorumPeer@1158] - tickTime set to 2000
2019-02-13 14:05:28,144 [myid:1] - INFO  [main:QuorumPeer@1204] - initLimit set to 10
2019-02-13 14:05:28,144 [myid:1] - INFO  [main:QuorumPeer@1178] - minSessionTimeout set to -1
2019-02-13 14:05:28,144 [myid:1] - INFO  [main:QuorumPeer@1189] - maxSessionTimeout set to -1
2019-02-13 14:05:28,151 [myid:1] - INFO  [main:QuorumPeer@1467] - QuorumPeer communication is not secured!
2019-02-13 14:05:28,153 [myid:1] - INFO  [main:QuorumPeer@1496] - quorum.cnxn.threads.size set to 20
2019-02-13 14:05:28,196 [myid:1] - INFO  [ListenerThread:QuorumCnxManager$Listener@736] - My election bind port: /172.16.150.154:3888
........
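
The log line "Defaulting to majority quorums" means the ensemble needs more than half of its servers to be up. As a small worked example of that arithmetic (the helper name zk_quorum is hypothetical):

```shell
# zk_quorum N: for an ensemble of N servers, print the majority quorum
# size and how many server failures the ensemble can tolerate.
# Hypothetical helper for illustration only.
zk_quorum() {
    local n=$1
    local quorum=$(( n / 2 + 1 ))   # strict majority
    echo "quorum=$quorum tolerated_failures=$(( n - quorum ))"
}
```

For the three-node cluster in this article, `zk_quorum 3` gives a quorum of 2, so the service survives the loss of one node. A fourth node would not raise that tolerance, which is why odd ensemble sizes are the usual choice.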

ZooKeeper service check

[root@node01 bin]#  netstat -nlpt | grep -E "2181|2888|3888"
tcp        0      0 0.0.0.0:2181            0.0.0.0:*               LISTEN      6242/java           
tcp        0      0 172.16.150.154:3888     0.0.0.0:*               LISTEN      6242/java 
[root@node02 ~]#  netstat -nlpt | grep -E "2181|2888|3888"
tcp        0      0 0.0.0.0:2181            0.0.0.0:*               LISTEN      5197/java           
tcp        0      0 172.16.150.155:3888     0.0.0.0:*               LISTEN      5197/java  
[root@node03 ~]#  netstat -nlpt | grep -E "2181|2888|3888"
tcp        0      0 0.0.0.0:2181            0.0.0.0:*               LISTEN      5304/java           
tcp        0      0 172.16.150.156:2888     0.0.0.0:*               LISTEN      5304/java   # only the leader listens on port 2888, so node03 is currently the leader
tcp        0      0 172.16.150.156:3888     0.0.0.0:*               LISTEN      5304/java

Test that the server responds

[root@node01 bin]# yum install telnet nc -y
[root@node01 bin]# telnet 172.16.150.154 2181
Trying 172.16.150.154...
Connected to 172.16.150.154.
Escape character is '^]'.
exit
Connection closed by foreign host.
[root@node01 bin]# echo "stat"|nc 172.16.150.154 2181  # conf shows the configuration, cons shows details for all client connections, and mntr gives more detail than stat
Zookeeper version: 3.4.13-2d71af4dbe22557fda74f9a9b4309b15a7487f03, built on 06/29/2018 04:05 GMT
Clients:
 /172.16.150.154:54989[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 1
Sent: 0
Connections: 1
Outstanding: 0
Zxid: 0x1000000d4
Mode: follower
Node count: 138
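
Besides checking which node listens on port 2888, the Mode line of the stat output also tells each server's role. The sketch below extracts it from stat output on stdin. The function name zk_mode is hypothetical, and the parsing assumes the "Mode: ..." line format shown above.

```shell
# zk_mode: read "stat" four-letter-command output on stdin and print
# the server role from its "Mode:" line (e.g. leader or follower).
# Hypothetical helper for illustration; not shipped with ZooKeeper.
zk_mode() {
    awk -F': ' '$1 == "Mode" { print $2 }'
}
```

A possible use, assuming nc is installed as above: `for h in 172.16.150.154 172.16.150.155 172.16.150.156; do echo "$h $(echo stat | nc "$h" 2181 | zk_mode)"; done`.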

Connect to ZooKeeper

[root@node01 bin]# ./zkCli.sh  -server 172.16.150.154:2181
Connecting to 172.16.150.154:2181
2019-02-13 14:25:24,060 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.13-2d71af4dbe22557fda74f9a9b4309b15a7487f03, built on 06/29/2018 04:05 GMT
....
[zk: 172.16.150.154:2181(CONNECTED) 0] h  # show command help
ZooKeeper -server host:port cmd args
    stat path [watch]
    set path data [version]
    ls path [watch]
    delquota [-n|-b] path
    ls2 path [watch]
    setAcl path acl
    setquota -n|-b val path
    history 
    redo cmdno
    printwatches on|off
    delete path [version]
    sync path
    listquota path
    rmr path
    get path [watch]
    create [-s] [-e] path data acl
    addauth scheme auth
    quit 
    getAcl path
    close 
    connect host:port
[zk: 172.16.150.154:2181(CONNECTED) 1] quit # exit

Set up jconsole to connect to ZooKeeper

[root@node01 bin]# vim zkServer.sh # edit line 54: 172.16.150.154 is this host's IP address, 8899 is the port jconsole connects to; SSL and authentication are disabled
 ZOOMAIN="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=$JMXLOCALONLY -Djava.rmi.server.hostname=172.16.150.154 -Dcom.sun.management.jmxremote.port=8899 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false org.apache.zookeeper.server.quorum.QuorumPeerMain"

./zkServer.sh stop && ./zkServer.sh start # restart the service, then in jconsole choose a remote connection and enter 172.16.150.154:8899

# Log in with jconsole

 

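Enabling the ZooKeeper superuser   # for ZooKeeper ACL permissions, see the official documentation

What if a znode has an ACL set and the user name and password for it have been forgotten? This is a painful situation: the stored digest is base64(sha1(user:password)), which cannot practically be reversed, so control over that node is effectively lost. Fortunately, ZooKeeper provides a superuser mechanism.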

[root@node01 bin]# cd /usr/local/zookeeper/lib/
[root@node01 lib]# java -cp ../zookeeper-3.4.13.jar:./log4j-1.2.17.jar:./slf4j-api-1.7.25.jar:./slf4j-log4j12-1.7.25.jar org.apache.zookeeper.server.auth.DigestAuthenticationProvider super:super
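super:super->super:gG7s8t3oDEtIqF6DM9LlI/R+9Ss=   # the generated digest
[root@node01 lib]# vim ../bin/zkServer.sh
SUPER_ACL="-Dzookeeper.DigestAuthenticationProvider.superDigest=super:gG7s8t3oDEtIqF6DM9LlI/R+9Ss="

# Add the SUPER_ACL line above to zkServer.sh and reference $SUPER_ACL in the java command that starts the server; otherwise the property never reaches the JVM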
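
The digest is the user name followed by base64(SHA-1("user:password")), so it can also be produced without the ZooKeeper jars. The following is a minimal sketch, assuming the openssl command line tool is installed (it is not used elsewhere in this article). The function name zk_digest is hypothetical.

```shell
# zk_digest USER:PASSWORD: print "USER:base64(sha1(USER:PASSWORD))",
# the format expected by the superDigest property.
# Assumes the openssl CLI is available; hypothetical helper.
zk_digest() {
    local user=${1%%:*}   # text before the first colon
    local hash
    hash=$(printf '%s' "$1" | openssl dgst -sha1 -binary | openssl base64)
    printf '%s:%s\n' "$user" "$hash"
}
```

`zk_digest super:super` should reproduce the DigestAuthenticationProvider output shown above, since that class computes the same hash. If the two differ, trust the Java tool.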

Verify that the user works

[root@node01 lib]# cd ../bin/
[root@node01 bin]# ./zkServer.sh stop # restart the service after editing the script
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
[root@node01 bin]# ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@node01 bin]# ./zkCli.sh -server 172.16.150.154
......
[zk: 172.16.150.154(CONNECTED) 0] addauth digest super:super  # authenticate as the user added earlier
[zk: 172.16.150.154(CONNECTED) 1] quit

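III. Kafka Cluster Installation

Kafka also depends on Java. Since it runs on the same machines as ZooKeeper, where the JDK is already installed, the Java setup can be skipped.

1. Install Kafka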
[root@node01 ~]# wget http://mirrors.tuna.tsinghua.edu.cn/apache/kafka/2.0.1/kafka_2.11-2.0.1.tgz
[root@node01 ~]# tar xf kafka_2.11-2.0.1.tgz -C /usr/local
[root@node01 ~]# cd /usr/local
[root@node01 local]# ln -sv kafka_2.11-2.0.1 kafka
[root@node01 local]# cd kafka/config/
[root@node01 config]# cp server.properties server.properties-bak
[root@node01 config]# grep "^[a-zA-Z]" server.properties 
broker.id=1  # must be unique per broker
listeners=PLAINTEXT://172.16.150.154:9092  # change to this host's own address
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/Data/kafka-logs # data directory; kafka-logs is created automatically
num.partitions=3
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=172.16.150.154:2181,172.16.150.155:2181,172.16.150.156:2181  # ZooKeeper cluster addresses, separated by ","
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0

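All other nodes are configured identically, except for the following (the # notes in these listings are annotations; do not copy them into server.properties, which does not support trailing comments):

broker.id=2  # unique per broker, e.g. 2 on node02 and 3 on node03
listeners=PLAINTEXT://172.16.150.155:9092  # this node's own address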
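
Since only these two keys differ between brokers, the per-node files can be generated from one template instead of being edited by hand. The following is a minimal sketch using sed. The function name render_broker_config is hypothetical, and port 9092 is taken from the listing above.

```shell
# render_broker_config TEMPLATE BROKER_ID HOST: print TEMPLATE with
# broker.id and listeners rewritten for one node; all other lines are
# passed through unchanged. Hypothetical helper for illustration.
render_broker_config() {
    sed -e "s|^broker\.id=.*|broker.id=$2|" \
        -e "s|^listeners=.*|listeners=PLAINTEXT://$3:9092|" "$1"
}
```

For example, `render_broker_config server.properties 2 172.16.150.155 > server.properties.node02` produces the file for node02, which can then be copied to that host.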

Start the service

[root@node01 config]# cd ../bin
[root@node01 bin]# ./kafka-server-start.sh -daemon ../config/server.properties # run in the background

Verify that the service is healthy

Verify from ZooKeeper:

[zk: 172.16.150.154(CONNECTED) 5] get  /brokers/ids/1 # show the registration for broker id 1
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://172.16.150.154:9092"],"jmx_port":-1,"host":"172.16.150.154","timestamp":"1549953312989","port":9092,"version":4}
cZxid = 0x10000002e
ctime = Tue Feb 12 14:35:13 CST 2019
mZxid = 0x10000002e
mtime = Tue Feb 12 14:35:13 CST 2019
pZxid = 0x10000002e
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x10077feb7bc0001
dataLength = 198
numChildren = 0

Verify by creating a topic

# On 154, create the topic and start a producer
[root@node01 ~]# cd /usr/local/kafka/bin/
[root@node01 bin]# ./kafka-topics.sh --create --zookeeper 172.16.150.154:2181 --replication-factor 1 --partitions 1 --topic Test
Created topic "Test".
[root@node01 bin]# ./kafka-console-producer.sh --broker-list 172.16.150.154:9092 --topic Test
# On another server, start a consumer
[root@node02 ~]# cd /usr/local/kafka/bin/
[root@node02 bin]# ./kafka-console-consumer.sh --bootstrap-server 172.16.150.155:9092 --topic Test --from-beginning
# Once both are running, type anything on 154 and check that it appears on the other machine

IV. ZooKeeper and Kafka Monitoring Tools

1. ZooKeeper monitoring tool (I have not installed it; see its documentation if needed)
ZooKeeper monitoring tool: https://github.com/soabase/exhibitor
2. Kafka monitoring tools

1) KafkaOffsetMonitor

[root@node01 ~]# mkdir KafkaMonitor
[root@node01 ~]# cd KafkaMonitor/
[root@node01 KafkaMonitor]# wget https://github.com/quantifind/KafkaOffsetMonitor/releases/download/v0.2.1/KafkaOffsetMonitor-assembly-0.2.1.jar
[root@node01 KafkaMonitor]# nohup java -cp KafkaOffsetMonitor-assembly-0.2.1.jar  com.quantifind.kafka.offsetapp.OffsetGetterWeb --zk 172.16.150.154:2181,172.16.150.155:2181,172.16.150.156:2181 --port 8088 --refresh 5.seconds --retain 1.days & 

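Access the web UI (the test environment has no data, so the production environment is used for the demonstration):

View past consumer activity

View the details of any one consumer

Note the lag field, which shows how far the consumer is behind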
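
For reference, lag for a partition is the log end offset minus the consumer group's committed offset; zero means the consumer has caught up. The sketch below works through that arithmetic. The function name consumer_lag and its three-column input format are hypothetical and are not the output format of any Kafka tool.

```shell
# consumer_lag: read lines of "PARTITION LOG_END_OFFSET COMMITTED_OFFSET"
# on stdin and print the lag per partition plus the total.
# The input format is an assumption made for this illustration.
consumer_lag() {
    awk '
        NF == 3 {
            lag = $2 - $3
            total += lag
            printf "partition=%s lag=%d\n", $1, lag
        }
        END { printf "total_lag=%d\n", total }
    '
}
```

For example, `printf '0 120 100\n1 80 80\n' | consumer_lag` reports a lag of 20 on partition 0, none on partition 1, and a total of 20.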

View topics

 

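2) kafka-manager

[root@node01 ~]# unzip kafka-manager-1.3.3.7.zip  # use a prebuilt package directly (link: https://pan.baidu.com/s/12sswyPo7-e9R3mZQ3ba-dA extraction code: jz6s)
[root@node01 ~]# cd kafka-manager-1.3.3.7
[root@node01 kafka-manager-1.3.3.7]# cd conf/
[root@node01 conf]# vim application.conf
kafka-manager.zkhosts="172.16.150.154:2181,172.16.150.155:2181,172.16.150.156:2181"  # ZooKeeper server addresses and ports
[root@node01 conf]# cd ../bin/
[root@node01 bin]# ./kafka-manager -Dconfig.file=../conf/application.conf -Dhttp.port=8888  # 8888 is the listening port; open it in a browser once started

# Building kafka-manager from source is complicated and often fails, so using a package someone has already built is recommended

3) kafka eagle

Not tested here; it is reportedly good and worth a look if you are interested.

As a beginner there is a lot I have not fully understood, so this write-up is fairly simple. Thanks for bearing with me!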

 

References:

https://zookeeper.apache.org/doc/r3.4.13/zookeeperAdmin.html

https://blog.csdn.net/pdw2009/article/details/73794525

https://blog.csdn.net/lizhitao/article/details/25667831

https://www.cnblogs.com/dadonggg/p/8242682.html

https://www.cnblogs.com/dadonggg/p/8205302.html

Source: https://www.cnblogs.com/panwenbin-logs/p/10369402.html