2022-10-02
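StarGo, an Operations Tool for StarRocks
Note: most of this article is drawn from the official StarRocks documentation at https://docs.starrocks.com/zh-cn/main/administration/stargo
StarGo is a command-line tool for managing multiple StarRocks clusters. With a few simple commands it can deploy, inspect, upgrade, start, and stop clusters.
StarGo is written in Go. The source is on GitHub: https://github.com/wangtianyi2004/starrocks-controller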
Deploying StarGo
Download and extract the StarGo binary package under the current user's directory.
    wget https://raw.githubusercontent.com/wangtianyi2004/starrocks-controller/main/stargo-pkg.tar.gz
    tar -xzvf stargo-pkg.tar.gz
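The package contains the following files.
stargo: the StarGo binary; no installation is required.
deploy-template.yaml: a template for the deployment configuration file.
repo.yaml: the configuration file that specifies the repository from which StarRocks packages are downloaded.
Note: the installation index files and packages for each version are available at http://cdn-thirdparty.starrocks.com.
The versions StarGo supports are listed at http://starrocks-thirdparty.oss-cn-zhangjiakou.aliyuncs.com/packageVersion.list
At the time of writing, the supported versions are:
v2.0.1, v2.1.3, v2.1.6, v2.2.0, v2.2.1, v2.2.2, v2.2.3, v2.3.0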
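Before deploying, it can help to confirm that the version you plan to install is on this list. The sketch below hard-codes the versions quoted above, which is an assumption because the published packageVersion.list may have changed since. The helper name is_supported is hypothetical and is not a StarGo command.

```shell
#!/bin/sh
# Versions copied from the list in this article. This is an assumption:
# the published packageVersion.list may have been updated since.
SUPPORTED_VERSIONS="v2.0.1 v2.1.3 v2.1.6 v2.2.0 v2.2.1 v2.2.2 v2.2.3 v2.3.0"

# is_supported VERSION (hypothetical helper, not part of StarGo):
# returns 0 if VERSION is in SUPPORTED_VERSIONS, 1 otherwise.
is_supported() {
  for v in $SUPPORTED_VERSIONS; do
    if [ "$v" = "$1" ]; then
      return 0
    fi
  done
  return 1
}

# Usage: guard a deployment on the version check.
if is_supported v2.2.2; then
  echo "v2.2.2 is supported"
fi
```

An exact string match is used on purpose, so a version written without the leading "v" is treated as unsupported rather than silently accepted.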
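Deploying a cluster
Prerequisites
The cluster needs at least one control node and three deployment nodes; all of them may be co-located on the same machine.
StarGo must be deployed on the control node.
Passwordless SSH (mutual trust) must be set up between the control node and the deployment nodes.
The following example sets up SSH trust between the control node sr-dev@r0 and the deployment nodes starrocks@r1, starrocks@r2, and starrocks@r3.
    ## Set up SSH trust from sr-dev@r0 to starrocks@r1, r2, and r3.
    [sr-dev@r0 ~]$ ssh-keygen
    [sr-dev@r0 ~]$ ssh-copy-id starrocks@r1
    [sr-dev@r0 ~]$ ssh-copy-id starrocks@r2
    [sr-dev@r0 ~]$ ssh-copy-id starrocks@r3
    ## Verify SSH trust from sr-dev@r0 to starrocks@r1, r2, and r3.
    [sr-dev@r0 ~]$ ssh starrocks@r1 date
    [sr-dev@r0 ~]$ ssh starrocks@r2 date
    [sr-dev@r0 ~]$ ssh starrocks@r3 date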
Creating the configuration file
Create a topology file for the StarRocks cluster based on the YAML template below. For details on each item, see "Parameter configuration" in the official documentation.
    [starrocks@bigdata11 stargo-pkg]$ cat deploy-star3.yaml
    global:
        user: "starrocks"
        ssh_port: 22

    fe_servers:
      - host: xx.xx.xx.229
        ssh_port: 22
        http_port: 38030
        rpc_port: 39020
        query_port: 39030
        edit_log_port: 39010
        deploy_dir: /opt/StarRocks1/fe
        meta_dir: /data/starrocks1/fe/meta
        log_dir: /data/starrocks1/fe/log
        priority_networks: xx.xx.xx.229
        config:
          sys_log_level: "INFO"

    be_servers:
      - host: xx.xx.xx.229
        ssh_port: 22
        be_port: 39060
        webserver_port: 38040
        heartbeat_service_port: 39050
        brpc_port: 38060
        deploy_dir: /opt/StarRocks1/be
        storage_dir: /data/starrocks1/be/storage
        log_dir: /data/starrocks1/be/log
        priority_networks: xx.xx.xx.229/24
        config:
          create_tablet_worker_count: 3
Note: this server already runs one StarRocks cluster. The example deploys a second, single-node cluster alongside it, so the installation paths and ports were changed to avoid conflicts.
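Creating the deployment directories (optional)
If the deployment paths set in the configuration file do not exist and you have permission to create them, StarGo creates the directories automatically from the configuration file. If a path already exists, make sure you have write permission on it. You can also create the paths yourself on each deployment node with the commands below.
Create the meta path on the FE node.
    mkdir -p /data/starrocks1/fe/meta
Create the storage path on the BE node.
    mkdir -p /data/starrocks1/be/storage
Note: make sure the paths created above match meta_dir and storage_dir in the configuration file.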
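If you create the directories by hand, a small check run on each node can catch problems before the deployment does. The sketch below is only an illustration: check_deploy_dir is a hypothetical helper, not a StarGo command. It treats a non-empty directory as a failure because the failed pre-check shown later in this article suggests moving existing contents aside; that reading of StarGo's behavior is an assumption.

```shell
#!/bin/sh
# check_deploy_dir DIR... (hypothetical helper, not part of StarGo):
# prints one status line per directory and returns non-zero if any
# directory is missing, not writable by the current user, or not empty.
# A missing directory is only flagged here; per the text above, StarGo
# may still create it if the user has permission.
check_deploy_dir() {
  rc=0
  for d in "$@"; do
    if [ ! -d "$d" ]; then
      echo "MISSING $d"
      rc=1
    elif [ ! -w "$d" ]; then
      echo "NOT-WRITABLE $d"
      rc=1
    elif [ -n "$(ls -A "$d")" ]; then
      echo "NOT-EMPTY $d"
      rc=1
    else
      echo "OK $d"
    fi
  done
  return "$rc"
}

# Example with the paths used in this article (run on the relevant node):
# check_deploy_dir /data/starrocks1/fe/meta /data/starrocks1/be/storage
```

The function only reads the filesystem, so it is safe to run repeatedly while fixing ownership or clearing old data.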
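Deploying StarRocks
Deploy a StarRocks cluster with the following command.
    ./stargo cluster deploy <cluster_name> <version> <topology_file>
Parameter | Description |
---|---|
cluster_name | Name of the cluster to create |
version | StarRocks version |
topology_file | Name of the configuration file |
Once the cluster has been created, it starts automatically. Deployment and startup have succeeded when beStatus and feStatus are both reported as true.
Example: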
    [starrocks@bigdata11 stargo-pkg]$ ./stargo cluster deploy star3 v2.2.2 deploy-star3.yaml
    [20220812-113235 OUTPUT] Deploy cluster [clusterName = star3, clusterVersion = v2.2.2, metaFile = deploy-star3.yaml]
    [20220812-113236 OUTPUT] PRE CHECK DEPLOY ENV:
    PreCheck FE:
    server id             ssh auth         meta dir                        deploy dir                      http port        rpc port         query port       edit log port    open files count
    --------------------  ---------------  ------------------------------  ------------------------------  ---------------  ---------------  ---------------  ---------------  ---------------
    xx.xx.xx.229:39010    PASS             PASS                            PASS                            PASS             PASS             PASS             PASS             PASS
    PreCheck BE:
    server id             ssh auth         storage dir                     deploy dir                      webSer port      heartbeat port   brpc port        be port          open files count
    --------------------  ---------------  ------------------------------  ------------------------------  ---------------  ---------------  ---------------  ---------------  ---------------
    xx.xx.xx.229:39060    PASS             PASS                            PASS                            PASS             PASS             PASS             PASS             PASS
    [20220812-113236 OUTPUT] PreCheck successfully. RESPECT
    [20220812-113236 OUTPUT] Create the deploy folder ...
    [20220812-113237 OUTPUT] Download StarRocks package & jdk ...
    [20220812-113322 INFO] The file starrocks-2.2.2-quickstart.tar.gz [1695364308] download successfully
    [20220812-113322 OUTPUT] Download done.
    [20220812-113322 OUTPUT] Decompress StarRocks pakcage & jdk ...
    [20220812-113325 INFO] The tar file /home/starrocks/.stargo/download/starrocks-2.2.2-quickstart.tar.gz has been decompressed under /home/starrocks/.stargo/download
    [20220812-113349 INFO] The tar file /home/starrocks/.stargo/download/StarRocks-2.2.2.tar.gz has been decompressed under /home/starrocks/.stargo/download
    [20220812-113354 INFO] The tar file /home/starrocks/.stargo/download/jdk-8u301-linux-x64.tar.gz has been decompressed under /home/starrocks/.stargo/download
    [20220812-113354 OUTPUT] Distribute FE Dir ...
    [20220812-113402 INFO] Upload dir feSourceDir = [/home/starrocks/.stargo/download/StarRocks-2.2.2/fe] to feTargetDir = [/opt/StarRocks1/fe] on FeHost = [xx.xx.xx.229]
    [20220812-113406 INFO] Upload dir JDKSourceDir = [/home/starrocks/.stargo/download/jdk1.8.0_301] to JDKTargetDir = [/opt/StarRocks1/fe/jdk] on FeHost = [xx.xx.xx.229]
    [20220812-113406 INFO] Modify JAVA_HOME: host = [xx.xx.xx.229], filePath = [/opt/StarRocks1/fe/bin/start_fe.sh]
    [20220812-113406 OUTPUT] Distribute BE Dir ...
    [20220812-113417 INFO] Upload dir BeSourceDir = [/home/starrocks/.stargo/download/StarRocks-2.2.2/be] to BeTargetDir = [/opt/StarRocks1/be] on BeHost = [xx.xx.xx.229]
    [20220812-113417 OUTPUT] Modify configuration for FE nodes & BE nodes ...
    ############################################# START FE CLUSTER #############################################
    ############################################# START FE CLUSTER #############################################
    [20220812-113417 INFO] Starting leader FE node [host = xx.xx.xx.229, editLogPort = 39010]
    [20220812-113438 WARN] The FE node doesn't start, wait for 10s [FeHost = xx.xx.xx.229, FeQueryPort = 39030, error = Process exited with status 1]
    [20220812-113438 INFO] Starting leader FE node [host = xx.xx.xx.229, editLogPort = 39010]
    [20220812-113454 INFO] The FE node start succefully [host = xx.xx.xx.229, queryPort = 39030]
    [20220812-113454 INFO] List all FE status:
        feHost = xx.xx.xx.229 feQueryPort = 39030 feStatus = true
    ############################################# START BE CLUSTER #############################################
    ############################################# START BE CLUSTER #############################################
    [20220812-113454 INFO] Starting BE node [BeHost = xx.xx.xx.229 HeartbeatServicePort = 39050]
    [20220812-113515 INFO] The BE node start succefully [host = xx.xx.xx.229, heartbeatServicePort = 39050]
    [20220812-113515 OUTPUT] List all BE status:
        beHost = xx.xx.xx.229 beHeartbeatServicePort = 39050 beStatus = true
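The success criterion described above (feStatus and beStatus both true) can be checked mechanically if the deployment output is saved to a file. The following is a minimal sketch under stated assumptions: cluster_is_up is a hypothetical helper, not a StarGo command, and it assumes the status fields appear in the saved output as `feStatus = ...` and `beStatus = ...`, as in the log shown here. Any value other than true is treated as a failure.

```shell
#!/bin/sh
# cluster_is_up LOGFILE (hypothetical helper, not part of StarGo):
# returns 0 only if the log contains at least one "feStatus = true" and at
# least one "beStatus = true", and no feStatus/beStatus with another value.
cluster_is_up() {
  # Extract every status field and normalize "key = value" to "key=value".
  statuses=$(grep -Eo '(fe|be)Status *= *[A-Za-z]+' "$1" | sed 's/ *= */=/')
  echo "$statuses" | grep -q '^feStatus=true$' || return 1
  echo "$statuses" | grep -q '^beStatus=true$' || return 1
  # Fail if any extracted status is something other than "true".
  if echo "$statuses" | grep -qv '=true$'; then
    return 1
  fi
  return 0
}
```

One way to use it, assuming StarGo writes this log to standard output, is to run `./stargo cluster deploy star3 v2.2.2 deploy-star3.yaml | tee deploy.log` and then `cluster_is_up deploy.log && echo "cluster is up"`.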
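If a pre-check fails, StarGo prints hints describing the problem and the commands to fix it. Follow them and rerun the deployment.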
    [starrocks@bigdata12 stargo-pkg]$ ./stargo cluster deploy star3 v2.2.2 deploy-star3.yaml
    [20220812-102653 OUTPUT] Deploy cluster [clusterName = star3, clusterVersion = v2.2.2, metaFile = deploy-star3.yaml]
    [20220812-102658 OUTPUT] PRE CHECK DEPLOY ENV:
    PreCheck FE:
    server id             ssh auth         meta dir                        deploy dir                      http port        rpc port         query port       edit log port    open files count
    --------------------  ---------------  ------------------------------  ------------------------------  ---------------  ---------------  ---------------  ---------------  ---------------
    xx.xx.xx.228:39010    PASS             FAILED: Priv failed             FAILED: Priv failed             PASS             PASS             PASS             PASS             PASS
    PreCheck BE:
    server id             ssh auth         storage dir                     deploy dir                      webSer port      heartbeat port   brpc port        be port          open files count
    --------------------  ---------------  ------------------------------  ------------------------------  ---------------  ---------------  ---------------  ---------------  ---------------
    xx.xx.xx.228:39060    PASS             FAILED: Dir exist/Priv failed   FAILED: Dir exist/Priv failed   PASS             PASS             PASS             PASS             PASS
    xx.xx.xx.229:39060    PASS             FAILED: Dir exist/Priv failed   FAILED: Dir exist/Priv failed   PASS             PASS             PASS             PASS             PASS
    xx.xx.xx.230:39060    PASS             FAILED: Dir exist               FAILED: Dir exist               PASS             PASS             PASS             PASS             PASS
    [20220812-102658 ERROR] Please use bellowing promption to fix the issue for FE servers:
    Detect the FE META FOLDER exist or no privilege. Use bellowing command to check or fix the issue:
        [Host = xx.xx.xx.228] chown -R starrocks /data/starrocks1/fe/meta
    Detect the FE DEPLOY FOLDER exist or no privilege. Use bellowing command to check or fix the issue:
        [Host = xx.xx.xx.228] chown -R starrocks /opt/StarRocks1/fe
    [20220812-102658 ERROR] Please use bellowing promption to fix the issue for BE servers:
    Detect the BE STORAGE FOLDER exist or no privilege. Use bellowing command to check or fix the issue:
        [Host = xx.xx.xx.228] mkdir /data/starrocks1/be/storage.bak && mv /data/starrocks1/be/storage/* /data/starrocks1/be/storage.bak/ && chown -R starrocks /data/starrocks1/be/storage
        [Host = xx.xx.xx.229] mkdir /data/starrocks1/be/storage.bak && mv /data/starrocks1/be/storage/* /data/starrocks1/be/storage.bak/ && chown -R starrocks /data/starrocks1/be/storage
        [Host = xx.xx.xx.230] mkdir /data/starrocks1/be/storage.bak && mv /data/starrocks1/be/storage/* /data/starrocks1/be/storage.bak/
    Detect the BE DEPLOY FOLDER exist or no privilege. Use bellowing command to check or fix the issue:
        [Host = xx.xx.xx.228] mkdir /opt/StarRocks1/be.bak && mv /opt/StarRocks1/be/* /opt/StarRocks1/be.bak/ && chown -R starrocks /opt/StarRocks1/be
        [Host = xx.xx.xx.229] mkdir /opt/StarRocks1/be.bak && mv /opt/StarRocks1/be/* /opt/StarRocks1/be.bak/ && chown -R starrocks /opt/StarRocks1/be
        [Host = xx.xx.xx.230] mkdir /opt/StarRocks1/be.bak && mv /opt/StarRocks1/be/* /opt/StarRocks1/be.bak/
    [20220812-102658 ERROR] PreCheck failed.
Viewing cluster information
    [starrocks@bigdata11 stargo-pkg]$ ./stargo cluster list
    [20220812-115246 OUTPUT] List all clusters
    ClusterName      Version     User        CreateDate                 MetaPath                                                      PrivateKey
    ---------------  ----------  ----------  -------------------------  ------------------------------------------------------------  --------------------------------------------------
    star3            v2.2.2      starrocks   2022-08-12 11:34:17        /home/starrocks/.stargo/cluster/star3                         /home/starrocks/.ssh/id_rsa
Viewing a specific cluster
    [starrocks@bigdata11 stargo-pkg]$ ./stargo cluster display star3
    [20220812-143023 OUTPUT] Display cluster [clusterName = star3]
    clusterName = star3
    clusterVerison = v2.2.2
    ID                          ROLE    HOST                  PORT             STAT        DATADIR                                             DEPLOYDIR
    --------------------------  ------  --------------------  ---------------  ----------  --------------------------------------------------  --------------------------------------------------
    xx.xx.xx.229:39010          FE      xx.xx.xx.229          39010/39030      UP/L        /opt/StarRocks1/fe                                  /data/starrocks1/fe/meta
    xx.xx.xx.229:39060          BE      xx.xx.xx.229          39060/39050      UP          /opt/StarRocks1/be                                  /data/starrocks1/be/storage
Upgrading the cluster
Upgrading StarRocks from 2.2.2 to 2.3.0:
    [starrocks@bigdata11 stargo-pkg]$ ./stargo cluster upgrade star3 v2.3.0
    [20220812-143227 OUTPUT] Upgrade cluster. [ClusterName = star3, TargetVersion = v2.3.0]
    [20220812-143227 OUTPUT] Upgrade StarRocks Cluster star3, from version v2.2.2 to version v2.3.0
    [20220812-143227 OUTPUT] Download StarRocks package & jdk ...
    [20220812-143340 INFO] The file starrocks-2.3.0-quickstart.tar.gz [1726845129] download successfully
    [20220812-143340 OUTPUT] Download done.
    [20220812-143340 OUTPUT] Decompress StarRocks pakcage & jdk ...
    [20220812-143346 INFO] The tar file /home/starrocks/.stargo/download/starrocks-2.3.0-quickstart.tar.gz has been decompressed under /home/starrocks/.stargo/download
    [20220812-143411 INFO] The tar file /home/starrocks/.stargo/download/StarRocks-2.3.0.tar.gz has been decompressed under /home/starrocks/.stargo/download
    [20220812-143411 INFO] The tar file /home/starrocks/.stargo/download/jdk-8u301-linux-x64.tar.gz has been decompressed under /home/starrocks/.stargo/download
    [20220812-143413 OUTPUT] Starting upgrade BE node. [beId = 0]
    [20220812-143413 INFO] upgrade be node - backup be lib. [host = xx.xx.xx.229, sourceDir = /opt/StarRocks1/be/lib, targetDir = /opt/StarRocks1/be/lib.bak-20220812143413]
    [20220812-143423 INFO] upgrade be node - upload new be lib. [host = xx.xx.xx.229, sourceDir = /home/starrocks/.stargo/download/StarRocks-2.3.0/be/lib, targetDir = /opt/StarRocks1/be/lib]
    [20220812-143423 INFO] Waiting for stoping BE node [BeHost = xx.xx.xx.229]
    [20220812-143429 INFO] upgrade be node - stop be node. [host = xx.xx.xx.229, beDeployDir = /opt/StarRocks1/be]
    [20220812-143430 INFO] upgrade be node - start be node. [host = xx.xx.xx.229, beDeployDir = /opt/StarRocks1/be]
    [20220812-143443 INFO] upgrade be node - start be node. [host = xx.xx.xx.229, beDeployDir = /opt/StarRocks1/be]
    [20220812-143443 OUTPUT] The Be node upgrade successfully. [beId = 0, currentVersion = v2.3.0-a9bdb09]
    [20220812-143443 OUTPUT] Starting upgrade FE node. [feId = 0]
    [20220812-143445 INFO] upgrade FE node - backup FE lib. [host = xx.xx.xx.229, sourceDir = /opt/StarRocks1/fe/lib, targetDir = /opt/StarRocks1/fe/lib.bak-20220812143443]
    [20220812-143448 INFO] upgrade FE node - upload new FE lib. [host = xx.xx.xx.229, sourceDir = /home/starrocks/.stargo/download/StarRocks-2.3.0/fe/lib, targetDir = /opt/StarRocks1/fe/lib]
    [20220812-143448 INFO] Waiting for stoping FE node [FeHost = xx.xx.xx.229]
    [20220812-143451 INFO] upgrade FE node - stop FE node. [host = xx.xx.xx.229, feDeployDir = /opt/StarRocks1/fe]
    [20220812-143452 INFO] upgrade FE node - start FE node. [host = xx.xx.xx.229, feDeployDir = /opt/StarRocks1/fe]
    [20220812-143452 ERROR] Error in ping db [dbPath = root:@tcp(xx.xx.xx.229:39030)/], error = dial tcp xx.xx.xx.229:39030: connect: connection refused
    [20220812-143502 INFO] upgrade FE node - start FE node. [host = xx.xx.xx.229, feDeployDir = /opt/StarRocks1/fe]
    [20220812-143503 ERROR] The FE node upgrade failed. [feId = 0, targetVersion = v2.3.0, currentVersion = v]
Note: the log ends with an error, but the upgrade did succeed. The "connection refused" on port 39030 suggests the FE was probably still starting when StarGo checked its version.
Viewing the cluster after the upgrade
    [20220812-143555 OUTPUT] List all clusters
    ClusterName      Version     User        CreateDate                 MetaPath                                                      PrivateKey
    ---------------  ----------  ----------  -------------------------  ------------------------------------------------------------  --------------------------------------------------
    star3            v2.3.0      starrocks   2022-08-12 14:35:03        /home/starrocks/.stargo/cluster/star3                         /home/starrocks/.ssh/id_rsa
Known limitations
StarGo does not deploy Broker.
The StarGo package bundles a JDK and does not replace a JDK that is already installed.