Greenplum Database is a massively parallel processing (MPP) database server whose architecture is purpose-built for managing large-scale analytical data warehouses and business-intelligence workloads.
MPP (also known as a shared-nothing architecture) refers to a system in which two or more processors cooperate to execute an operation, each processor with its own memory, operating system, and disks.
Component Overview
Master
- The Greenplum Master is the entry point to the entire Greenplum Database system: it accepts connections and SQL queries and distributes the work to the Segment instances
- The Master holds no user data; data lives only on the Segments
- The Master authenticates client connections, processes incoming SQL commands, distributes the workload across the Segments, coordinates the results each Segment returns, and presents the final result to the client program
- Collects the execution results from the Segments
Segment
- Each Greenplum Segment instance is an independent PostgreSQL database that stores a portion of the data and performs the bulk of the query processing
- When a user connects through the Master and issues a query, processes are created on every Segment database to handle that query's work
- A Segment host typically runs 2 to 8 Greenplum Segment instances
- Executes the SQL statements dispatched by the Master
- Responsible for storing and computing over its share of the data
Interconnect
- The Interconnect is the networking layer of the Greenplum Database architecture
- The Interconnect refers to the inter-process communication between Segments and the network infrastructure this communication relies on
- The Interconnect uses a standard switched Ethernet fabric
- By default, the Interconnect sends messages over the network using UDP
Cluster Deployment
Version Notes
Versions: CentOS 7.0 + greenplum-db-6.19.1-rhel7-x86_64.rpm
Download: https://github.com/greenplum-db/gpdb/releases
Host List
The cluster consists of 4 machines: 1 Master node, 1 Standby node (which serves as a backup for the Master), and 2 Segment nodes.
192.168.3.31  mdw   (Master node)
192.168.3.32  smdw  (Standby node)
192.168.3.33  sdw1  (Segment node)
192.168.3.34  sdw2  (Segment node)
Environment Initialization
Perform the following configuration steps on every server.
- Configure the hostname and IP address
- Disable the firewall
All machines must be able to reach each other over the network, and the firewall must be disabled on every machine, to rule out connectivity problems.
[root@mdw ~]# systemctl stop firewalld       # stop the firewall
[root@mdw ~]# systemctl disable firewalld    # disable it at boot
[root@mdw ~]# systemctl status firewalld     # check its status
- Disable SELinux
[root@mdw ~]# sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
[root@mdw ~]# setenforce 0
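To confirm the edit took effect, a small helper can read the configured mode back out of the file. This is a sketch: `selinux_mode_in` is a hypothetical name, and the example runs against a scratch copy rather than the real /etc/selinux/config.

```shell
# Hypothetical helper: print the SELINUX mode configured in a given file
# (pass /etc/selinux/config on a real host).
selinux_mode_in() {
  # value of the SELINUX= line; SELINUXTYPE= is not matched
  grep -E '^SELINUX=' "$1" | cut -d= -f2
}

# Example against a scratch copy of the config file:
tmpcfg=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$tmpcfg"
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' "$tmpcfg"   # same edit as above
selinux_mode_in "$tmpcfg"    # prints: disabled
rm -f "$tmpcfg"
```

Note that `setenforce 0` only changes the running mode; the config-file edit is what keeps SELinux disabled after the reboot performed later.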
- Install dependencies
[root@mdw ~]# yum install apr apr-util bash bzip2 curl krb5 libcurl libevent libxml2 libyaml zlib openldap openssh openssl openssl-libs perl readline rsync R sed tar zip -y
- Configure file descriptor limits
[root@mdw ~]# vim /etc/security/limits.conf
* soft nofile 524288
* hard nofile 524288
* soft nproc 131072
* hard nproc 131072
- Configure process limits
[root@mdw ~]# vim /etc/security/limits.d/20-nproc.conf
* soft nproc 131072
root soft nproc unlimited
- Tune kernel parameters
[root@mdw ~]# vim /etc/sysctl.conf
kernel.shmall = 2097152
kernel.shmmax = 2147483648
kernel.shmmni = 4096
kernel.sem = 500 2048000 200 40960
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 10000 65535
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.dirty_background_ratio = 0
vm.dirty_ratio = 0
vm.dirty_background_bytes = 1610612736
vm.dirty_bytes = 4294967296
After editing, run sysctl -p to apply the settings.
- Add hosts entries
This prepares the cluster so that the Greenplum nodes can later communicate with each other by hostname.
[root@mdw ~]# vim /etc/hosts
192.168.3.31 mdw
192.168.3.32 smdw
192.168.3.33 sdw1
192.168.3.34 sdw2
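A quick way to check that no entry was missed on a node is to look each expected hostname up in the hosts file. The helper below is a sketch with a hypothetical name, demonstrated against a scratch file missing one entry; pass /etc/hosts on a real node.

```shell
# Hypothetical check: print every cluster hostname NOT present in a hosts file.
hosts_missing() {
  local file=$1; shift
  for h in "$@"; do
    # -w matches whole words, so "mdw" does not falsely match inside "smdw"
    grep -qw "$h" "$file" || echo "$h"
  done
}

# Example with a scratch file that lacks the sdw2 entry:
tmphosts=$(mktemp)
printf '192.168.3.31 mdw\n192.168.3.32 smdw\n192.168.3.33 sdw1\n' > "$tmphosts"
hosts_missing "$tmphosts" mdw smdw sdw1 sdw2   # prints: sdw2
rm -f "$tmphosts"
```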
- Configure /etc/sysconfig/network
Note: the content differs on each node (use that node's own hostname).
[root@mdw ~]# vim /etc/sysconfig/network
# Created by anaconda
NETWORKING=yes
HOSTNAME=mdw
- Disable transparent huge pages (THP)
[root@mdw ~]# grubby --update-kernel=ALL --args="transparent_hugepage=never"
- Disable RemoveIPC
[root@mdw ~]# vim /etc/systemd/logind.conf
RemoveIPC=no
- Reboot all servers
[root@mdw ~]# init 6
Greenplum Database Installation
- Create the group and user (all nodes)
Create the gpadmin user and group; this operating-system user will own the Greenplum installation.
[root@mdw ~]# groupadd gpadmin
[root@mdw ~]# useradd -r -m -g gpadmin gpadmin
[root@mdw ~]# passwd gpadmin    # gpadmin.123
- Set up passwordless SSH between all cluster nodes (Master node)
[root@mdw ~]# su - gpadmin
[gpadmin@mdw ~]$ cd /home/gpadmin/
# generate a key pair (accept the defaults)
[gpadmin@mdw ~]$ ssh-keygen -t rsa
# enter the password of each target host when prompted
[gpadmin@mdw ~]$ ssh-copy-id smdw
[gpadmin@mdw ~]$ ssh-copy-id sdw1
[gpadmin@mdw ~]$ ssh-copy-id sdw2
Verify as follows: as the gpadmin user on the Master node, running ssh sdw1 should log in to sdw1 without prompting for a password.
[gpadmin@mdw ~]$ ssh sdw1
Last login: Sun Jul 31 19:21:57 2022 from mdw
[gpadmin@sdw1 ~]$
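Instead of logging in to each host by hand, the check can be looped over all hosts. A minimal sketch, using the host names from this article: BatchMode makes ssh fail immediately rather than prompt, so a host where key-based login is broken is reported instead of hanging.

```shell
# Check passwordless SSH to every other node; BatchMode=yes forbids
# password prompts, so a failure means key-based login is not working.
for h in smdw sdw1 sdw2; do
  if ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" true 2>/dev/null; then
    echo "$h: ok"
  else
    echo "$h: passwordless SSH NOT working"
  fi
done
```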
- Install the Greenplum RPM package (Master node)
[root@mdw ~]# cd /usr/local/
[root@mdw local]# sudo yum install ./open-source-greenplum-db-6.19.1-rhel7-x86_64.rpm
# give the gpadmin user ownership of the installed files
[root@mdw local]# chown -R gpadmin:gpadmin /usr/local/greenplum*
- Create hostlist, containing the hostnames of all nodes (Master node)
[root@mdw ~]# su - gpadmin
[gpadmin@mdw ~]$ mkdir /home/gpadmin/conf
[gpadmin@mdw ~]$ cd /home/gpadmin/conf
[gpadmin@mdw conf]$ vim hostlist
mdw
smdw
sdw1
sdw2
- Create seg_hosts, containing the hostnames of all Segment hosts (Master node)
[gpadmin@mdw conf]$ vim seg_hosts
sdw1
sdw2
- Exchange keys among all servers with gpssh-exkeys (Master node)
[root@mdw ~]# su - gpadmin
[gpadmin@mdw ~]$ cd /usr/local/greenplum-db-6.19.1/
[gpadmin@mdw greenplum-db-6.19.1]$ source /usr/local/greenplum-db-6.19.1/greenplum_path.sh
[gpadmin@mdw greenplum-db-6.19.1]$ cd /home/gpadmin/conf
[gpadmin@mdw conf]$ gpssh-exkeys -f hostlist
[STEP 1 of 5] create local ID and authorize on local host
  ... /home/gpadmin/.ssh/id_rsa file exists ... key generation skipped
[STEP 2 of 5] keyscan all hosts and update known_hosts file
[STEP 3 of 5] retrieving credentials from remote hosts
  ... send to smdw
  ... send to sdw1
  ... send to sdw2
[STEP 4 of 5] determine common authentication file content
[STEP 5 of 5] copy authentication files to all remote hosts
  ... finished key exchange with smdw
  ... finished key exchange with sdw1
  ... finished key exchange with sdw2
[INFO] completed successfully
- Install Greenplum on the Standby node
[root@smdw ~]# hostname
smdw
[root@smdw ~]# cd /usr/local/
[root@smdw local]# sudo yum install ./open-source-greenplum-db-6.19.1-rhel7-x86_64.rpm
# give the gpadmin user ownership of the installed files
[root@smdw local]# chown -R gpadmin:gpadmin /usr/local/greenplum*
- Install Greenplum on the Segment nodes (both nodes)
[root@sdw1 ~]# hostname
sdw1
[root@sdw1 ~]# cd /usr/local/
[root@sdw1 local]# sudo yum install ./open-source-greenplum-db-6.19.1-rhel7-x86_64.rpm
# give the gpadmin user ownership of the installed files
[root@sdw1 local]# chown -R gpadmin:gpadmin /usr/local/greenplum*
- Configure environment variables in .bash_profile (all nodes)
[root@mdw ~]# su - gpadmin
[gpadmin@mdw ~]$ vim /home/gpadmin/.bash_profile
source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/home/gpadmin/data/master/gpseg-1
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=gp_sydb
export LD_PRELOAD=/lib64/libz.so.1 ps
[gpadmin@mdw ~]$ source /home/gpadmin/.bash_profile
- Configure the GPHOME environment variable (all nodes)
[gpadmin@mdw ~]$ vim /usr/local/greenplum-db/greenplum_path.sh
# add the following line
GPHOME=/usr/local/greenplum-db
- Create the data directories (Master node)
Create the data directories on every node. After this step, each host listed in hostlist has a data directory containing master, primary, and mirror subdirectories.
[gpadmin@mdw ~]$ cd /usr/local/greenplum-db
[gpadmin@mdw greenplum-db]$ gpssh -f /home/gpadmin/conf/hostlist
=> mkdir data
=> cd data
=> mkdir master
=> mkdir primary
=> mkdir mirror
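The interactive gpssh session above creates the same layout on each host. Reproduced locally as a single command (using a scratch base directory here instead of /home/gpadmin, which is an assumption for demonstration only):

```shell
# The per-host directory layout the gpssh session creates, built with one
# mkdir -p instead of the interactive step-by-step commands.
base=$(mktemp -d)    # stands in for /home/gpadmin on a real host
mkdir -p "$base/data/master" "$base/data/primary" "$base/data/mirror"
ls "$base/data"      # lists: master mirror primary
```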
- Connectivity check
[gpadmin@mdw greenplum-db]$ gpcheckperf -f /home/gpadmin/conf/hostlist -r N -d /tmp
/usr/local/greenplum-db-6.19.1/bin/gpcheckperf -f /home/gpadmin/conf/hostlist -r N -d /tmp
-------------------
--  NETPERF TEST
-------------------
====================
==  RESULT 2022-07-31T20:03:03.635228
====================
Netperf bisection bandwidth test
mdw -> smdw = 500.910000
sdw1 -> sdw2 = 500.540000
smdw -> mdw = 494.230000
sdw2 -> sdw1 = 492.510000
Summary:
sum = 1988.19 MB/sec
min = 492.51 MB/sec
max = 500.91 MB/sec
avg = 497.05 MB/sec
median = 500.54 MB/sec
Output like the above confirms that the nodes can reach each other.
- Edit the initialization configuration file (Master node)
Copy the template file gpinitsystem_config, then edit the copy.
[gpadmin@mdw greenplum-db]$ mkdir /home/gpadmin/gpconfigs
[gpadmin@mdw greenplum-db]$ cp /usr/local/greenplum-db/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfigs/gpinitsystem_config
[gpadmin@mdw ~]$ vim /home/gpadmin/gpconfigs/gpinitsystem_config
# primary data directories; the number of entries determines how many
# PostgreSQL instances are created on each Segment host
declare -a DATA_DIRECTORY=(/home/gpadmin/data/primary /home/gpadmin/data/primary)
# hostname of the Master node
MASTER_HOSTNAME=mdw
# data directory of the Master node
MASTER_DIRECTORY=/home/gpadmin/data/master
# mirror data directories; same rule as DATA_DIRECTORY, and the entry
# count must match DATA_DIRECTORY
declare -a MIRROR_DATA_DIRECTORY=(/home/gpadmin/data/mirror /home/gpadmin/data/mirror)
# default database name; must match PGDATABASE in the environment
# variables, otherwise initialization fails
DATABASE_NAME=gp_sydb
Create a new file hostfile_gpinitsystem in /home/gpadmin/gpconfigs:
[gpadmin@mdw ~]$ cd /home/gpadmin/gpconfigs
[gpadmin@mdw gpconfigs]$ vim hostfile_gpinitsystem
# add the following lines
sdw1
sdw2
- Initialize the database
[gpadmin@mdw ~]$ gpinitsystem -c gpconfigs/gpinitsystem_config -h gpconfigs/hostfile_gpinitsystem
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Checking new segment hosts, Completed
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Greenplum Database Creation Parameters
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:---------------------------------------
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Master Configuration
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:---------------------------------------
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Master instance name = Greenplum Data Platform
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Master hostname = mdw
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Master port = 5432
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Master instance dir = /home/gpadmin/data/master/gpseg-1
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Master LOCALE = en_US.utf8
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Greenplum segment prefix = gpseg
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Master Database = gp_sydb
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Master connections = 250
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Master buffers = 128000kB
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Segment connections = 750
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Segment buffers = 128000kB
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Checkpoint segments = 8
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Encoding = UNICODE
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Postgres param file = Off
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Initdb to be used = /usr/local/greenplum-db-6.19.1/bin/initdb
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-GP_LIBRARY_PATH is = /usr/local/greenplum-db-6.19.1/lib
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-HEAP_CHECKSUM is = on
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-HBA_HOSTNAMES is = 0
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Ulimit check = Passed
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Array host connect type = Single hostname per node
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Master IP address [1] = ::1
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Master IP address [2] = 192.168.122.1
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Master IP address [3] = 192.168.3.31
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Master IP address [4] = fe80::2e1e:864b:fbd:fa96
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Standby Master = Not Configured
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Number of primary segments = 2
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Total Database segments = 4
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Trusted shell = ssh
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Number segment hosts = 2
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Mirroring config = OFF
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:----------------------------------------
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Greenplum Primary Segment Configuration
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:----------------------------------------
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-sdw1 6000 sdw1 /home/gpadmin/data/primary/gpseg0 2
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-sdw1 6001 sdw1 /home/gpadmin/data/primary/gpseg1 3
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-sdw2 6000 sdw2 /home/gpadmin/data/primary/gpseg2 4
20220731:20:18:41:008655 gpinitsystem:mdw:gpadmin-[INFO]:-sdw2 6001 sdw2 /home/gpadmin/data/primary/gpseg3 5
Continue with Greenplum creation Yy|Nn (default=N):
> y
20220731:20:18:45:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Building the Master instance database, please wait...
20220731:20:18:49:008655 gpinitsystem:mdw:gpadmin-[INFO]:-Starting the Master in admin mode
A successful completion message at the end of this output indicates that the installation succeeded.
Note: if initialization fails, delete the data directories under /home/gpadmin/data before re-initializing.
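The cleanup before re-running gpinitsystem can be sketched as a small helper that empties the data subdirectories while keeping the directories themselves (the function name is hypothetical; the paths follow this article's layout):

```shell
# Cautious sketch: remove the CONTENTS of master/primary/mirror under a
# given data directory, keeping the directories in place for the next
# gpinitsystem run. The ${1:?} guard refuses to run without an argument.
clean_gp_data() {
  local base=${1:?usage: clean_gp_data <data-dir>}
  for d in master primary mirror; do
    [ -d "$base/$d" ] && find "$base/$d" -mindepth 1 -delete
  done
}

# On the real cluster: clean_gp_data /home/gpadmin/data  (run on every
# host, e.g. via gpssh -f /home/gpadmin/conf/hostlist)
```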
- Enable remote login (Master node)
To connect with client tools such as Navicat or DBeaver, add the following configuration:
[gpadmin@mdw ~]$ echo "host all gpadmin 0.0.0.0/0 trust" >> /home/gpadmin/data/master/gpseg-1/pg_hba.conf
# reload the database configuration
[gpadmin@mdw ~]$ gpstop -u
Note: gpstop -u reloads pg_hba.conf and the runtime parameters in the Master's postgresql.conf without interrupting service. Sessions that are already connected keep their old settings; new connections pick up the new configuration after reconnecting. This is what makes the pg_hba.conf change just made take effect.
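Running the echo append twice would leave a duplicate rule in pg_hba.conf. A hedged, idempotent variant (the helper name is hypothetical; demonstrated on a scratch file, with the real target being /home/gpadmin/data/master/gpseg-1/pg_hba.conf followed by gpstop -u):

```shell
# Append a pg_hba.conf rule only if that exact line is not already
# present, so repeated runs don't accumulate duplicates.
append_hba_rule() {
  local conf=$1 rule=$2
  # -x: whole-line match, -F: fixed string (no regex interpretation)
  grep -qxF "$rule" "$conf" || echo "$rule" >> "$conf"
}

# Demonstrated on a scratch file:
tmphba=$(mktemp)
append_hba_rule "$tmphba" "host all gpadmin 0.0.0.0/0 trust"
append_hba_rule "$tmphba" "host all gpadmin 0.0.0.0/0 trust"   # no-op the second time
wc -l < "$tmphba"    # prints: 1
rm -f "$tmphba"
```

Note that `trust` allows passwordless access from any address; on anything but a closed lab network, `md5` with a real password is the safer method.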
- Configure the Standby node (Master node)
[gpadmin@mdw ~]$ gpinitstandby -s smdw
20220731:21:15:16:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Validating environment and parameters for standby initialization...
20220731:21:15:16:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Checking for data directory /home/gpadmin/data/master/gpseg-1 on smdw
20220731:21:15:16:013189 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
20220731:21:15:16:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master initialization parameters
20220731:21:15:16:013189 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
20220731:21:15:16:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master hostname = mdw
20220731:21:15:16:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master data directory = /home/gpadmin/data/master/gpseg-1
20220731:21:15:16:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master port = 5432
20220731:21:15:16:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master hostname = smdw
20220731:21:15:16:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master port = 5432
20220731:21:15:16:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master data directory = /home/gpadmin/data/master/gpseg-1
20220731:21:15:16:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum update system catalog = On
Do you want to continue with standby master initialization? Yy|Nn (default=N):
> y
20220731:21:15:24:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Syncing Greenplum Database extensions to standby
20220731:21:15:25:013189 gpinitstandby:mdw:gpadmin-[INFO]:-The packages on smdw are consistent.
20220731:21:15:25:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Adding standby master to catalog...
20220731:21:15:25:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Database catalog updated successfully.
20220731:21:15:25:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Updating pg_hba.conf file...
20220731:21:15:26:013189 gpinitstandby:mdw:gpadmin-[INFO]:-pg_hba.conf files updated successfully.
20220731:21:15:28:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Starting standby master
20220731:21:15:28:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Checking if standby master is running on host: smdw in directory: /home/gpadmin/data/master/gpseg-1
20220731:21:15:31:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Cleaning up pg_hba.conf backup files...
20220731:21:15:32:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Backup files of pg_hba.conf cleaned up successfully.
20220731:21:15:32:013189 gpinitstandby:mdw:gpadmin-[INFO]:-Successfully created standby master on smdw
To verify that the Standby node is configured, check on the Standby server that the relevant processes are running:
[gpadmin@smdw ~]$ hostname
smdw
[gpadmin@smdw ~]$ ps -ef | grep postgres
gpadmin  56777      1  1 21:15 ?      00:00:01 /usr/local/greenplum-db-6.19.1/bin/postgres -D /home/gpadmin/data/master/gpseg-1 -p 5432 -E
gpadmin  56845  56777  0 21:15 ?      00:00:00 postgres:  5432, master logger process
gpadmin  56846  56777  0 21:15 ?      00:00:00 postgres:  5432, startup process   waiting for 000000010000000000000003
gpadmin  56847  56777  0 21:15 ?      00:00:00 postgres:  5432, checkpointer process
gpadmin  56848  56777  0 21:15 ?      00:00:00 postgres:  5432, writer process
gpadmin  56849  56777  0 21:15 ?      00:00:00 postgres:  5432, wal receiver process
gpadmin  56951  54578  0 21:16 pts/0  00:00:00 grep --color=auto postgres
At this point the Greenplum cluster is installed and the Standby is configured.
Verifying the Greenplum Database
- Log in and verify (Master node)
[gpadmin@mdw ~]$ psql -d postgres
psql (9.4.26)
Type "help" for help.

# list the databases
postgres=# \l
                                List of databases
   Name    |  Owner  | Encoding |  Collate   |   Ctype    |  Access privileges
-----------+---------+----------+------------+------------+---------------------
 gp_sydb   | gpadmin | UTF8     | en_US.utf8 | en_US.utf8 |
 postgres  | gpadmin | UTF8     | en_US.utf8 | en_US.utf8 |
 template0 | gpadmin | UTF8     | en_US.utf8 | en_US.utf8 | =c/gpadmin         +
           |         |          |            |            | gpadmin=CTc/gpadmin
 template1 | gpadmin | UTF8     | en_US.utf8 | en_US.utf8 | =c/gpadmin         +
           |         |          |            |            | gpadmin=CTc/gpadmin
(4 rows)

# switch to another database
postgres=# \c dbname
Database Operations
Stopping and starting the cluster (Master node)
- Interactive stop; waits for all database connections to close before shutting down
[gpadmin@mdw ~]$ gpstop
- Interactive fast stop; forcibly terminates all database connections and shuts down immediately
[gpadmin@mdw ~]$ gpstop -M fast
- Interactive start
[gpadmin@mdw ~]$ gpstart
- Restart
[gpadmin@mdw ~]$ gpstop -ar
Cluster status (Master node)
- Show the status of the whole cluster (all Master and Segment information)
[gpadmin@mdw ~]$ gpstate -s
- Show standby master details
[gpadmin@mdw ~]$ gpstate -f
- Show the primary-to-mirror mapping
[gpadmin@mdw ~]$ gpstate -c
- Show mirror synchronization status
[gpadmin@mdw ~]$ gpstate -m
- Show Greenplum version information
[gpadmin@mdw ~]$ gpstate -i
Adding mirrors
A freshly installed Greenplum cluster has only primary segments and no mirrors, so high availability is not guaranteed. Mirror segments therefore need to be added.
Note: adding mirrors while the data volume is large puts significant pressure on the disks, because the data is continuously synchronized in the background.
- Check the system configuration table; only primary segments exist (Master node)
[gpadmin@mdw ~]$ psql -d postgres
psql (9.4.26)
Type "help" for help.

postgres=# SELECT * FROM gp_segment_configuration;
 dbid | content | role | preferred_role | mode | status | port | hostname | address |              datadir
------+---------+------+----------------+------+--------+------+----------+---------+-----------------------------------
    1 |      -1 | p    | p              | n    | u      | 5432 | mdw      | mdw     | /home/gpadmin/data/master/gpseg-1
    2 |       0 | p    | p              | n    | u      | 6000 | sdw1     | sdw1    | /home/gpadmin/data/primary/gpseg0
    4 |       2 | p    | p              | n    | u      | 6000 | sdw2     | sdw2    | /home/gpadmin/data/primary/gpseg2
    3 |       1 | p    | p              | n    | u      | 6001 | sdw1     | sdw1    | /home/gpadmin/data/primary/gpseg1
    5 |       3 | p    | p              | n    | u      | 6001 | sdw2     | sdw2    | /home/gpadmin/data/primary/gpseg3
    6 |      -1 | m    | m              | s    | u      | 5432 | smdw     | smdw    | /home/gpadmin/data/master/gpseg-1
(6 rows)
Create the data directories
- On every host that will hold mirrors, create the mirror data directory; this was already done in the steps above (Master node):
[gpadmin@sdw1 ~]$ hostname
sdw1
[gpadmin@sdw1 ~]$ ll /home/gpadmin/data/
total 0
drwxrwxr-x 2 gpadmin gpadmin  6 Aug  6 11:02 master
drwxrwxr-x 2 gpadmin gpadmin  6 Aug  6 11:02 mirror
drwxrwxr-x 4 gpadmin gpadmin 34 Aug  6 11:29 primary
Generate the addmirror file
- Run gpaddmirrors and, when prompted, enter the directories where the mirror data will be stored; this generates the configuration file (Master node):
[gpadmin@mdw ~]$ gpaddmirrors -o ./addmirror
20220806:12:28:59:060581 gpaddmirrors:mdw:gpadmin-[INFO]:-Starting gpaddmirrors with args: -o ./addmirror
20220806:12:28:59:060581 gpaddmirrors:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.19.1 build commit:0e314744a460630073b46cea7b7cf20a81e3da63 Open Source'
20220806:12:28:59:060581 gpaddmirrors:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.26 (Greenplum Database 6.19.1 build commit:0e314744a460630073b46cea7b7cf20a81e3da63 Open Source) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Jan 18 2022 13:51:20'
20220806:12:28:59:060581 gpaddmirrors:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20220806:12:29:00:060581 gpaddmirrors:mdw:gpadmin-[INFO]:-Heap checksum setting consistent across cluster
Enter mirror segment data directory location 1 of 2 >
/home/gpadmin/data/mirror
Enter mirror segment data directory location 2 of 2 >
/home/gpadmin/data/mirror
20220806:12:29:25:060581 gpaddmirrors:mdw:gpadmin-[INFO]:-Configuration file output to ./addmirror successfully.
- View the file contents (Master node):
[gpadmin@mdw ~]$ cat addmirror
0|sdw2|7000|/home/gpadmin/data/mirror/gpseg0
1|sdw2|7001|/home/gpadmin/data/mirror/gpseg1
2|sdw1|7000|/home/gpadmin/data/mirror/gpseg2
3|sdw1|7001|/home/gpadmin/data/mirror/gpseg3
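Judging from the file above, each line appears to hold contentID|address|port|data_directory, and the generated placement puts sdw1's mirrors on sdw2 and vice versa, so one host failing never takes out a primary and its mirror together. A small awk sketch to read the file back in that interpretation (the demo file is a scratch copy, not the real ./addmirror):

```shell
# Read an addmirror-style file assuming fields contentID|address|port|datadir.
demo=$(mktemp)
cat > "$demo" <<'EOF'
0|sdw2|7000|/home/gpadmin/data/mirror/gpseg0
1|sdw2|7001|/home/gpadmin/data/mirror/gpseg1
EOF
awk -F'|' '{ printf "content %s -> mirror on %s:%s in %s\n", $1, $2, $3, $4 }' "$demo"
rm -f "$demo"
```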
Run the add command
[gpadmin@mdw ~]$ gpaddmirrors -i addmirror
20220810:16:22:04:009304 gpaddmirrors:mdw:gpadmin-[INFO]:-Starting to create new pg_hba.conf on primary segments
20220810:16:22:05:009304 gpaddmirrors:mdw:gpadmin-[INFO]:-Successfully modified pg_hba.conf on primary segments to allow replication connections
20220810:16:22:05:009304 gpaddmirrors:mdw:gpadmin-[INFO]:-4 segment(s) to add
20220810:16:22:05:009304 gpaddmirrors:mdw:gpadmin-[INFO]:-Validating remote directories
20220810:16:22:06:009304 gpaddmirrors:mdw:gpadmin-[INFO]:-Configuring new segments
Check segment status
- Run gpstate -m; the Data Status column should show Synchronized:
[gpadmin@mdw primary]$ gpstate -m
20220806:12:43:02:061084 gpstate:mdw:gpadmin-[INFO]:-Starting gpstate with args: -m
20220806:12:43:02:061084 gpstate:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.19.1 build commit:0e314744a460630073b46cea7b7cf20a81e3da63 Open Source'
20220806:12:43:02:061084 gpstate:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.26 (Greenplum Database 6.19.1 build commit:0e314744a460630073b46cea7b7cf20a81e3da63 Open Source) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Jan 18 2022 13:51:20'
20220806:12:43:02:061084 gpstate:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20220806:12:43:02:061084 gpstate:mdw:gpadmin-[INFO]:--------------------------------------------------------------
20220806:12:43:02:061084 gpstate:mdw:gpadmin-[INFO]:--Current GPDB mirror list and status
20220806:12:43:02:061084 gpstate:mdw:gpadmin-[INFO]:--Type = Group
20220806:12:43:02:061084 gpstate:mdw:gpadmin-[INFO]:--------------------------------------------------------------
20220806:12:43:02:061084 gpstate:mdw:gpadmin-[INFO]:-   Mirror   Datadir                            Port   Status    Data Status
20220806:12:43:02:061084 gpstate:mdw:gpadmin-[INFO]:-   sdw2     /home/gpadmin/data/mirror/gpseg0   7000   Passive   Synchronized
20220806:12:43:02:061084 gpstate:mdw:gpadmin-[INFO]:-   sdw2     /home/gpadmin/data/mirror/gpseg1   7001   Passive   Synchronized
20220806:12:43:02:061084 gpstate:mdw:gpadmin-[INFO]:-   sdw1     /home/gpadmin/data/mirror/gpseg2   7000   Passive   Synchronized
20220806:12:43:02:061084 gpstate:mdw:gpadmin-[INFO]:-   sdw1     /home/gpadmin/data/mirror/gpseg3   7001   Passive   Synchronized
20220806:12:43:02:061084 gpstate:mdw:gpadmin-[INFO]:--------------------------------------------------------------
- Check the system configuration table:
[gpadmin@mdw ~]$ psql -d postgres
psql (9.4.26)
Type "help" for help.

postgres=# SELECT * FROM gp_segment_configuration;
 dbid | content | role | preferred_role | mode | status | port | hostname | address |              datadir
------+---------+------+----------------+------+--------+------+----------+---------+-----------------------------------
    1 |      -1 | p    | p              | n    | u      | 5432 | mdw      | mdw     | /home/gpadmin/data/master/gpseg-1
    2 |       0 | p    | p              | n    | u      | 6000 | sdw1     | sdw1    | /home/gpadmin/data/primary/gpseg0
    5 |       3 | p    | p              | n    | u      | 6001 | sdw2     | sdw2    | /home/gpadmin/data/primary/gpseg3
    6 |      -1 | m    | m              | s    | u      | 5432 | smdw     | smdw    | /home/gpadmin/data/master/gpseg-1
    7 |       0 | m    | m              | n    | d      | 7000 | sdw2     | sdw2    | /home/gpadmin/data/mirror/gpseg0
   10 |       3 | m    | m              | n    | d      | 7001 | sdw1     | sdw1    | /home/gpadmin/data/mirror/gpseg3
    4 |       2 | p    | p              | s    | u      | 6000 | sdw2     | sdw2    | /home/gpadmin/data/primary/gpseg2
    9 |       2 | m    | m              | s    | u      | 7000 | sdw1     | sdw1    | /home/gpadmin/data/mirror/gpseg2
    3 |       1 | p    | p              | s    | u      | 6001 | sdw1     | sdw1    | /home/gpadmin/data/primary/gpseg1
    8 |       1 | m    | m              | s    | u      | 7001 | sdw2     | sdw2    | /home/gpadmin/data/mirror/gpseg1
(10 rows)
Common Problems and Solutions
- Running gpstate -s to check the cluster status reports the following error:
[gpadmin@mdw ~]$ gpstate -s
20220731:20:44:49:012783 gpstate:mdw:gpadmin-[INFO]:-Starting gpstate with args: -s
20220731:20:44:49:012783 gpstate:mdw:gpadmin-[CRITICAL]:-gpstate failed. (Reason='[Errno 2] No such file or directory: '/data/master/gpseg-1/postgresql.conf'') exiting...
Solution: check the environment variable configuration.
[gpadmin@mdw ~]$ cat /home/gpadmin/.bash_profile
export PATH
source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/master/gpseg-1
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=gp_sydb
export LD_PRELOAD=/lib64/libz.so.1 ps
The MASTER_DATA_DIRECTORY path here is wrong; it should be changed (on all nodes) to:
export MASTER_DATA_DIRECTORY=/home/gpadmin/data/master/gpseg-1
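One hedged way to apply the fix in place, rather than editing each profile by hand (the helper name is hypothetical; run it against /home/gpadmin/.bash_profile on every node, or wrap the call with gpssh -f hostlist):

```shell
# Rewrite the MASTER_DATA_DIRECTORY line in a profile file, whatever
# wrong path it currently points at.
fix_mdd() {
  sed -i 's|^export MASTER_DATA_DIRECTORY=.*|export MASTER_DATA_DIRECTORY=/home/gpadmin/data/master/gpseg-1|' "$1"
}

# Example on a scratch copy of the broken profile:
tmpprof=$(mktemp)
echo 'export MASTER_DATA_DIRECTORY=/data/master/gpseg-1' > "$tmpprof"
fix_mdd "$tmpprof"
cat "$tmpprof"   # export MASTER_DATA_DIRECTORY=/home/gpadmin/data/master/gpseg-1
rm -f "$tmpprof"
```

Remember to re-source the profile (source /home/gpadmin/.bash_profile) after changing it, or the running shell keeps the old value.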
References
https://blog.csdn.net/msh6453/article/details/124892605
https://blog.51cto.com/feko/2738318
http://events.jianshu.io/p/e0d70fd545dc
https://www.bilibili.com/video/BV1XJ411L7Gv
Original article; when reposting, please credit the source: http://www.opcoder.cn/article/59/