Spark HA Cluster Installation

Last updated on January 17, 2025 am

🧙 Questions

Install Spark 3.4.0 in HA (high-availability) cluster mode.

Deployment environment: CentOS 7.9, 64-bit.

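An odd number of servers is recommended, because ZooKeeper decides by majority quorum. isxcode-main1 and isxcode-main2 are the master nodes: both run a Master process, one is elected active and the other waits on standby.
Note: failover to the standby master can take 1 to 2 minutes, which is fairly slow.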

hostname       public IP      private IP      CPU  memory
isxcode-main1  39.98.65.13    172.16.215.101  8C   16GB
isxcode-main2  47.92.252.139  172.16.215.100  8C   16GB
isxcode-node1  39.98.76.185   172.16.215.97   4C   8GB
isxcode-node2  39.98.77.191   172.16.215.96   4C   8GB
isxcode-node3  47.92.217.66   172.16.215.95   4C   8GB

☄️ Ideas

Configure passwordless SSH from the master nodes (run on both master nodes)
ssh-keygen
ssh-copy-id ispong@isxcode-main1
ssh-copy-id ispong@isxcode-main2
ssh-copy-id ispong@isxcode-node1
ssh-copy-id ispong@isxcode-node2
ssh-copy-id ispong@isxcode-node3
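
The five ssh-copy-id calls above can be collapsed into a loop that also confirms key-based login afterwards. This is a minimal sketch, assuming the same user (ispong) on every host and that ssh-keygen has already been run; SSH_USER, HOSTS, APPLY, ssh_targets and run are illustrative names, not part of the original setup. By default it only prints the commands; set APPLY=1 to execute them.

```shell
#!/bin/sh
# Illustrative helper: SSH_USER, HOSTS and APPLY are assumed names.
SSH_USER="ispong"
HOSTS="isxcode-main1 isxcode-main2 isxcode-node1 isxcode-node2 isxcode-node3"

# Print one user@host target per line.
ssh_targets() {
  for host in $HOSTS; do
    echo "${SSH_USER}@${host}"
  done
}

# Echo the command in dry-run mode (default); execute it when APPLY=1.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    echo "+ $*"
  fi
}

for target in $(ssh_targets); do
  # Copy the public key, then check that login works without a password prompt.
  run ssh-copy-id "$target"
  run ssh -o BatchMode=yes -o ConnectTimeout=5 "$target" true
done
```

With BatchMode=yes, ssh fails instead of asking for a password, so a non-zero exit status points at a host where the key was not installed.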
Install the ZooKeeper cluster

Reference document: ZooKeeper cluster installation

Download the Spark package (downloading on one machine is enough)
nohup wget https://archive.apache.org/dist/spark/spark-3.4.0/spark-3.4.0-bin-hadoop3.tgz >> download_spark.log 2>&1 &  
tail -f download_spark.log
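
Because the download runs in the background, it is worth checking the archive before distributing it. The sketch below compares a file's SHA-512 digest with an expected value; verify_sha512 is a hypothetical helper, and the expected digest has to be copied by hand from the .sha512 file that Apache typically publishes next to the release archive (its exact layout is not assumed here).

```shell
#!/bin/sh
# Compare a file's SHA-512 digest with an expected hex string.
# Returns 0 on match, 1 on mismatch or if the file does not exist.
verify_sha512() {
  file="$1"
  # Normalize the expected digest to lowercase, as sha512sum prints lowercase hex.
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  [ -f "$file" ] || return 1
  actual=$(sha512sum "$file" | awk '{print $1}')
  [ "$actual" = "$expected" ]
}

# Usage (the digest below is a placeholder, not a real value):
#   verify_sha512 spark-3.4.0-bin-hadoop3.tgz "<expected-hex-digest>" && echo "checksum OK"
```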
Distribute the package
scp spark-3.4.0-bin-hadoop3.tgz ispong@isxcode-main2:~/
scp spark-3.4.0-bin-hadoop3.tgz ispong@isxcode-node1:~/
scp spark-3.4.0-bin-hadoop3.tgz ispong@isxcode-node2:~/
scp spark-3.4.0-bin-hadoop3.tgz ispong@isxcode-node3:~/
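
The scp commands above can also be generated from a host list, skipping whichever machine the script runs on. A rough sketch, assuming the hostnames from the table and that `hostname` returns the same short names; SSH_USER, HOSTS, PACKAGE and remote_hosts are illustrative names. It only prints the commands, so nothing is copied unless the output is piped to `sh`.

```shell
#!/bin/sh
# Illustrative names: SSH_USER, HOSTS, PACKAGE and remote_hosts are assumptions.
SSH_USER="ispong"
HOSTS="isxcode-main1 isxcode-main2 isxcode-node1 isxcode-node2 isxcode-node3"
PACKAGE="spark-3.4.0-bin-hadoop3.tgz"

# Print every cluster host except the one passed as $1 (the local machine).
remote_hosts() {
  for host in $HOSTS; do
    [ "$host" = "$1" ] || echo "$host"
  done
}

# Print one scp command per remote host.
for host in $(remote_hosts "$(hostname)"); do
  echo "scp $PACKAGE ${SSH_USER}@${host}:~/"
done
```

The same loop works for the conf directory later on by swapping the printed command for `scp -r /opt/spark/conf ...`.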
Extract and install Spark (run on every machine)
sudo mkdir -p /data/spark 
sudo chown -R ispong:ispong /data/spark
tar -vzxf spark-3.4.0-bin-hadoop3.tgz -C /data/spark
sudo ln -s /data/spark/spark-3.4.0-bin-hadoop3 /opt/spark
sudo vim /etc/profile
# Append the following two lines to /etc/profile:
export SPARK_HOME=/opt/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
# Then reload the profile in the current shell:
source /etc/profile
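
A quick sanity check after this step can catch a broken symlink or a missed extraction before any configuration is written. The sketch below is an assumption-laden helper (check_spark_home is a hypothetical name) that only checks for three scripts used later in this guide, then prints the Spark version if they are present.

```shell
#!/bin/sh
# Hypothetical helper: verify that the scripts used later in this guide
# exist and are executable under the given Spark directory.
check_spark_home() {
  dir="$1"
  for rel in bin/spark-submit sbin/start-master.sh sbin/start-workers.sh; do
    if [ ! -x "$dir/$rel" ]; then
      echo "missing or not executable: $dir/$rel" >&2
      return 1
    fi
  done
}

# Fall back to /opt/spark (the symlink created above) if SPARK_HOME is unset.
if check_spark_home "${SPARK_HOME:-/opt/spark}"; then
  "${SPARK_HOME:-/opt/spark}/bin/spark-submit" --version
else
  echo "Spark installation looks incomplete"
fi
```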
Edit the Spark configuration files (configure on one machine only)
cp /opt/spark/conf/spark-env.sh.template /opt/spark/conf/spark-env.sh
vim /opt/spark/conf/spark-env.sh
# Spark master service port
export SPARK_MASTER_PORT=7077
# Spark master web UI port
export SPARK_MASTER_WEBUI_PORT=8081
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
# Enable ZooKeeper-based master recovery
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
                               -Dspark.deploy.zookeeper.url=isxcode-main1:2181,isxcode-main2:2181,isxcode-node1:2181,isxcode-node2:2181,isxcode-node3:2181 \
                               -Dspark.deploy.zookeeper.dir=/spark"
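
The three properties above do the HA work: spark.deploy.recoveryMode=ZOOKEEPER turns on leader election and state recovery, spark.deploy.zookeeper.url lists the ZooKeeper ensemble, and spark.deploy.zookeeper.dir is the znode under which Spark stores its recovery state. As a sketch, the long URL can be built from a host list so it stays in sync with the cluster; ZK_HOSTS, ZK_PORT and zk_url are illustrative names, and port 2181 is taken from the setting above.

```shell
#!/bin/sh
# Illustrative names: ZK_HOSTS, ZK_PORT and zk_url are assumptions.
ZK_HOSTS="isxcode-main1 isxcode-main2 isxcode-node1 isxcode-node2 isxcode-node3"
ZK_PORT=2181

# Join "host:port" pairs with commas, e.g. "a:2181,b:2181".
zk_url() {
  url=""
  for host in $ZK_HOSTS; do
    url="${url:+$url,}${host}:${ZK_PORT}"
  done
  echo "$url"
}

SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=$(zk_url) -Dspark.deploy.zookeeper.dir=/spark"
export SPARK_DAEMON_JAVA_OPTS
echo "$SPARK_DAEMON_JAVA_OPTS"
```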
cp /opt/spark/conf/workers.template /opt/spark/conf/workers
vim /opt/spark/conf/workers

Note: depending on your needs, all nodes (including the masters) can be listed here as workers.

isxcode-node1
isxcode-node2
isxcode-node3
Distribute the configuration
scp -r /opt/spark/conf ispong@isxcode-main2:/opt/spark/
scp -r /opt/spark/conf ispong@isxcode-node1:/opt/spark/
scp -r /opt/spark/conf ispong@isxcode-node2:/opt/spark/
scp -r /opt/spark/conf ispong@isxcode-node3:/opt/spark/
Configure Spark autostart (on the master nodes only)

Start the master node

sudo vim /usr/lib/systemd/system/spark.service
  • Master
[Unit]
Description=Spark Service
After=network.target zookeeper.service

[Service]
Environment=JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
Type=forking
WorkingDirectory=/data/spark
PermissionsStartOnly=true
ExecStartPre=/bin/sleep 10
ExecStart=/opt/spark/sbin/start-master.sh
ExecStop=/opt/spark/sbin/stop-master.sh
ExecReload=/opt/spark/sbin/start-master.sh
KillMode=none
Restart=always
User=ispong
Group=ispong

[Install]
WantedBy=multi-user.target

Start all worker nodes

sudo vim /usr/lib/systemd/system/spark-workers.service
[Unit]
Description=Spark Workers Service
After=network.target spark.service

[Service]
Environment=JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
Type=forking
WorkingDirectory=/data/spark
PermissionsStartOnly=true
ExecStartPre=/bin/sleep 10
ExecStart=/opt/spark/sbin/start-workers.sh
ExecStop=/opt/spark/sbin/stop-workers.sh
ExecReload=/opt/spark/sbin/start-workers.sh
KillMode=none
Restart=always
User=ispong
Group=ispong

[Install]
WantedBy=multi-user.target
Enable start on boot (run on the master nodes only)
sudo systemctl daemon-reload
sudo systemctl enable spark
sudo systemctl enable spark-workers
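
Note that `systemctl enable` only registers the units for the next boot; `sudo systemctl start spark` and `sudo systemctl start spark-workers` bring them up immediately. Once both masters are running, clients should list both of them in the master URL so they can follow a failover. The sketch below prints a test submission using the SparkPi example; the jar path is an assumption based on the 3.4.0 bin-hadoop3 layout and should be checked with `ls /opt/spark/examples/jars`, and MASTER_URL, EXAMPLES_JAR and submit_pi_cmd are illustrative names.

```shell
#!/bin/sh
# Both masters are listed; the client registers with whichever one is active.
MASTER_URL="spark://isxcode-main1:7077,isxcode-main2:7077"
# Assumed jar location; verify it on your installation.
EXAMPLES_JAR="/opt/spark/examples/jars/spark-examples_2.12-3.4.0.jar"

# Print the spark-submit command for the bundled SparkPi example.
submit_pi_cmd() {
  echo "spark-submit --master $MASTER_URL --class org.apache.spark.examples.SparkPi $EXAMPLES_JAR 10"
}

submit_pi_cmd
```

Stopping the active master with `sudo systemctl stop spark` and submitting again after the 1 to 2 minute failover window is one way to confirm that the standby takes over.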
Access URL
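
With SPARK_MASTER_WEBUI_PORT=8081 as configured earlier, each master should serve its web UI on port 8081. The sketch below asks both masters which role they hold. It assumes that the standalone master exposes a JSON view at /json/ containing a "status" field such as ALIVE or STANDBY; treat that endpoint and field as something to verify on your own installation. parse_master_status and check_masters are hypothetical helper names.

```shell
#!/bin/sh
# Extract the value of the "status" field from master JSON read on stdin.
# Prints nothing if no such field is found.
parse_master_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p' | tail -n 1
}

# Query each master's web UI (port 8081, as set in spark-env.sh) and
# print its reported status, or "unreachable" if nothing came back.
check_masters() {
  for host in isxcode-main1 isxcode-main2; do
    status=$(curl -s --max-time 5 "http://${host}:8081/json/" | parse_master_status)
    echo "${host}: ${status:-unreachable}"
  done
}
```

Calling `check_masters` from a machine that can resolve the hostnames should show one master as ALIVE and the other as STANDBY if the assumption about the endpoint holds.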

https://ispong.isxcode.com/hadoop/spark/spark HA集群安装/
Author: ispong
Posted on: June 2, 2023