
Cloud Native PostgreSQL Operator

참도다리 2023. 11. 5. 03:18

 

Table of Contents
1. Introduction to PostgreSQL
2. CloudNativePG Overview and Installation
3. Basic Usage of CloudNativePG
4. CloudNativePG Failure Testing
5. CloudNativePG Scaling & Rolling Updates
6. CloudNativePG Miscellaneous

 

1. Introduction to PostgreSQL

What is PostgreSQL?

An open-source ORDBMS developed by the PostgreSQL Global Development Group.

PostgreSQL is commonly pronounced "Post-GRES" and is an open-source database well known for its reliability, flexibility, and support for open technical standards. Unlike other RDBMSs (relational database management systems), PostgreSQL supports both non-relational and relational data types.
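
For example, both styles can live side by side in one table: relational columns next to a non-relational JSONB document column. A minimal sketch against any PostgreSQL instance (the items table below is made up for illustration):

# A relational column (title) next to a non-relational JSONB column (attrs)
psql -c "CREATE TABLE items (id INT PRIMARY KEY, title TEXT, attrs JSONB);"
psql -c "INSERT INTO items VALUES (1, 'probe', '{\"color\": \"red\"}');"
psql -c "SELECT title, attrs->>'color' AS color FROM items;"  # ->> extracts a JSON field as text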

 

 

2. CloudNativePG Overview and Installation

 

CloudNativePG manages PostgreSQL workloads in a Kubernetes environment.

CloudNativePG was developed by EDB, released under the Apache License 2.0, and submitted to the CNCF Sandbox in April 2022. The operator supports Level V - Auto Pilot (it covers all capabilities of Level V, the highest of the operator capability levels).

<https://cloudnative-pg.io/documentation/current/operator_capability_levels/>

 

Supported features

- Direct integration with Kubernetes API server for High Availability, without requiring an external tool
- Self-Healing capability, through:
  - failover of the primary instance by promoting the most aligned replica
  - automated recreation of a replica
- Planned switchover of the primary instance by promoting a selected replica
- Scale up/down capabilities
- Definition of an arbitrary number of instances (minimum 1 - one primary server)
- Definition of the read-write service, to connect your applications to the only primary server of the cluster
- Definition of the read-only service, to connect your applications to any of the instances for reading workloads
- Declarative management of PostgreSQL configuration, including certain popular Postgres extensions through the cluster spec: pgaudit, auto_explain, pg_stat_statements, and pg_failover_slots
- Declarative management of Postgres roles, users and groups
- Support for Local Persistent Volumes with PVC templates
- Reuse of Persistent Volumes storage in Pods
- Separate volume for WAL files
- Rolling updates for PostgreSQL minor versions
- In-place or rolling updates for operator upgrades
- TLS connections and client certificate authentication
- Support for custom TLS certificates (including integration with cert-manager)
- Continuous WAL archiving to an object store (AWS S3 and S3-compatible, Azure Blob Storage, and Google Cloud Storage)
- Backups on volume snapshots (where supported by the underlying storage classes)
- Backups on object stores (AWS S3 and S3-compatible, Azure Blob Storage, and Google Cloud Storage)
- Full recovery and Point-In-Time recovery from an existing backup on volume snapshots or object stores
- Offline import of existing PostgreSQL databases, including major upgrades of PostgreSQL
- Fencing of an entire PostgreSQL cluster, or a subset of the instances in a declarative way
- Hibernation of a PostgreSQL cluster in a declarative way
- Support for Synchronous Replicas
- Support for HA physical replication slots at cluster level
- Backup from a standby
- Backup retention policies (based on recovery window, only on object stores)
- Parallel WAL archiving and restore to allow the database to keep up with WAL generation on high write systems
- Support tagging backup files uploaded to an object store to enable optional retention management at the object store layer
- Replica clusters for PostgreSQL deployments across multiple Kubernetes clusters, enabling private, public, hybrid, and multi-cloud architectures
- Connection pooling with PgBouncer
- Support for node affinity via nodeSelector
- Native customizable exporter of user defined metrics for Prometheus through the metrics port (9187)
- Standard output logging of PostgreSQL error messages in JSON format
- Automatically set readOnlyRootFilesystem security context for pods
- cnpg plugin for kubectl
- Simple bind and search+bind LDAP client authentication
- Multi-arch format container images
- OLM installation

 

Replication strategy: storage-level replication vs. application-level replication

  • PostgreSQL does not recommend storage-level replication: https://youtu.be/99uSJXkKpeI?si=4Shn7hdxmCt4IDnO
  • For HA and scaling, PostgreSQL relies on built-in physical replication capabilities based on Write-Ahead Log (WAL) shipping.
  • WAL (Write-Ahead Logging): PostgreSQL uses WAL files to keep data consistent.
    • Log: a log marks and records every change made in the DBMS. Logs are kept in durable storage where there is no risk of data loss.
    • WAL file: to store data accurately without loss, the log must be written first so there is always a way to recover.
      • To prevent data loss completely, the database can refuse to move on to the next data change until the log has been written.
      • Following this principle, the file that records changes before the data itself is modified is the WAL file.
      • Because changes are recorded in advance, using WAL guarantees data integrity.
  • PostgreSQL supports asynchronous or synchronous streaming replication, and replicas can serve reads thanks to the Hot Standby feature.
    • PostgreSQL supports both asynchronous and synchronous streaming replication over the network, as well as asynchronous file-based log shipping (normally used as a fallback option, for example, to store WAL files in an object store).
    • Replicas are usually called standby servers and can also be used for read-only workloads, thanks to the Hot Standby feature.
  • CloudNativePG supports clusters configured with asynchronous or synchronous streaming replication.

Source: https://cloudnative-pg.io/documentation/current/architecture/
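
WAL positions can be observed directly from a running cluster. A minimal sketch using the mycluster-rw service and the myclient1 pod that are created later in this post:

# Current write position in the WAL (LSN) and the WAL segment file it maps to
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -c "SELECT pg_current_wal_lsn(), pg_walfile_name(pg_current_wal_lsn());"

# Per-standby streaming replication progress, tracked as WAL positions
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -c "SELECT application_name, state, sync_state, sent_lsn, replay_lsn FROM pg_stat_replication;"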

  • CloudNativePG supports clusters based on asynchronous and synchronous streaming replication to manage multiple hot standby replicas within the same Kubernetes cluster, with the following specifications:
    • One primary, with optional multiple hot standby replicas for High Availability
    • Available services for applications:
      • rw: applications connect to the only primary instance of the cluster
      • ro: applications connect only to hot standby replicas for read-only workloads
      • r: applications connect to any of the instances for read-only workloads
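
Each of these is an ordinary Kubernetes Service, so applications only need its DNS name. A sketch of the naming (default namespace assumed; the app password comes from the mycluster-app secret covered in section 3):

# Cluster-internal DNS names of the three services
# mycluster-rw.default.svc.cluster.local -> the primary only
# mycluster-ro.default.svc.cluster.local -> standbys only
# mycluster-r.default.svc.cluster.local  -> any instance
psql "postgresql://app:<APP_PASSWORD>@mycluster-rw.default.svc:5432/app" -c "\conninfo"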

Read-write workloads: the -rw service routes requests to the primary

Source: https://cloudnative-pg.io/documentation/current/architecture/

 

Read-only workloads: the -ro service routes requests to the standbys in a round-robin fashion; for reference, -r requests go to any instance

Source: https://cloudnative-pg.io/documentation/current/architecture/

 

Multi-cluster deployments: across two different Kubernetes clusters, the Designated Primary of the Replica Cluster replicates from the Primary of the primary cluster

Source: https://cloudnative-pg.io/documentation/current/architecture/
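
A replica cluster is itself declared as a Cluster resource that bootstraps from the source and stays in replica mode. A rough sketch following the CloudNativePG docs (the object store path and the aws-creds secret below are placeholders, assuming the source cluster already archives WAL to S3):

cat <<EOT> replica-cluster.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: mycluster-replica
spec:
  instances: 3
  storage:
    size: 3Gi
  bootstrap:
    recovery:
      source: mycluster   # bootstrap from the source cluster's backup
  replica:
    enabled: true         # stay a replica: the local primary is only "designated"
    source: mycluster
  externalClusters:
  - name: mycluster
    barmanObjectStore:
      destinationPath: s3://BUCKET-NAME/
      s3Credentials:
        accessKeyId:
          name: aws-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: aws-creds
          key: ACCESS_SECRET_KEY
EOT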

 

Installation - Docs Quickstart OLM CharacterSet Replication Bootstrap Examples

# Install Operator Lifecycle Manager (OLM), a tool to help manage the Operators running on your cluster.
curl -sL https://github.com/operator-framework/operator-lifecycle-manager/releases/download/v0.25.0/install.sh | bash -s v0.25.0
kubectl get ns
kubectl get all -n olm
kubectl get-all -n olm
kubectl get all -n operators
kubectl get-all -n operators | grep -v packagemanifest

# Install the operator by running the following command
# This Operator will be installed in the "operators" namespace and will be usable from all namespaces in the cluster.
curl -s -O https://operatorhub.io/install/cloudnative-pg.yaml
cat cloudnative-pg.yaml | yh
kubectl create -f cloudnative-pg.yaml
kubectl get all -n operators
kubectl get-all -n operators | grep -v packagemanifest
kubectl get crd | grep cnpg
backups.postgresql.cnpg.io                    2023-10-29T02:47:55Z
clusters.postgresql.cnpg.io                   2023-10-29T02:47:56Z
poolers.postgresql.cnpg.io                    2023-10-29T02:47:56Z
scheduledbackups.postgresql.cnpg.io           2023-10-29T02:47:56Z

## Check API info
kubectl explain clusters
kubectl explain backups

# After install, watch your operator come up using the next command.
# To use it, checkout the custom resource definitions (CRDs) introduced by this operator to start using it.
kubectl get clusterserviceversions -n operators # csv is short for clusterserviceversions
kubectl get csv -A

---------------------------------------
# Deploy a PostgreSQL cluster: version 15.3 (a rolling update to 15.4 follows later)
## 3 pods (1 primary, 2 standbys); sets config parameters, pg_hba, bootstrap, and the pod monitor
cat <<EOT> mycluster1.yaml
# Example of PostgreSQL cluster
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: mycluster
spec:
  imageName: ghcr.io/cloudnative-pg/postgresql:15.3
  instances: 3  
  storage:
    size: 3Gi
  postgresql:
    parameters:
      max_worker_processes: "40"
      timezone: "Asia/Seoul"
    pg_hba:
      - host all postgres all trust
  primaryUpdateStrategy: unsupervised 
  enableSuperuserAccess: true
  bootstrap:
    initdb:
      database: app
      encoding: UTF8
      localeCType: C
      localeCollate: C
      owner: app

  monitoring:
    enablePodMonitor: true
EOT

# Watch the order in which the pods come up: job.batch (initdb -> join)
kubectl apply -f mycluster1.yaml && kubectl get pod -w

# Verify
kubectl get pod,pvc,pv,svc,ep
kubectl df-pv
kubectl describe pod -l cnpg.io/cluster=mycluster # metrics exposed on TCP 9187
kubectl get pod -l cnpg.io/cluster=mycluster -owide
curl -s <pod IP>:9187/metrics
curl -s 192.168.1.84:9187/metrics

# Access the Prometheus web UI via the ingress domain
## Targets -> check podMonitor/default/cluster-with-metrics
## Graph: browse metrics by "cnpg", e.g. cnpg_pg_replication_lag
echo -e "Prometheus Web URL = https://prometheus.$MyDomain"

# Set up the Grafana dashboard: CloudNativePG dashboard
kubectl apply -f https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/main/docs/src/samples/monitoring/grafana-configmap.yaml

# Access the Grafana web UI via the ingress domain: default account - admin / prom-operator
echo -e "Grafana Web URL = https://grafana.$MyDomain"

# Check the installed cluster
kubectl get-all --since 7m
kubectl get cluster
NAME        AGE     INSTANCES   READY   STATUS                     PRIMARY
mycluster   4m32s   3           3       Cluster in healthy state   mycluster-1

# Install the cnpg kubectl plugin: option 1 (krew), option 2 (direct install)
## (direct install) curl -sSfL https://github.com/cloudnative-pg/cloudnative-pg/raw/main/hack/install-cnpg-plugin.sh | sudo sh -s -- -b /usr/local/bin
kubectl krew install cnpg
kubectl cnpg status mycluster
Cluster Summary
Name:                mycluster
Namespace:           default
System ID:           7295211024748187666
PostgreSQL Image:    ghcr.io/cloudnative-pg/postgresql:15.3
Primary instance:    mycluster-1
Primary start time:  2023-10-29 03:04:52 +0000 UTC (uptime 1h32m21s)
Status:              Cluster in healthy state
Instances:           3
Ready instances:     3
Current Write LSN:   0/7000000 (Timeline: 1 - WAL File: 000000010000000000000006)

Certificates Status
Certificate Name       Expiration Date                Days Left Until Expiration
----------------       ---------------                --------------------------
mycluster-ca           2024-01-27 02:59:05 +0000 UTC  89.93
mycluster-replication  2024-01-27 02:59:05 +0000 UTC  89.93
mycluster-server       2024-01-27 02:59:05 +0000 UTC  89.93

Continuous Backup status
Not configured

Streaming Replication status
Replication Slots Enabled
Name         Sent LSN   Write LSN  Flush LSN  Replay LSN  Write Lag  Flush Lag  Replay Lag  State      Sync State  Sync Priority  Replication Slot
----         --------   ---------  ---------  ----------  ---------  ---------  ----------  -----      ----------  -------------  ----------------
mycluster-2  0/7000000  0/7000000  0/7000000  0/7000000   00:00:00   00:00:00   00:00:00    streaming  async       0              active
mycluster-3  0/7000000  0/7000000  0/7000000  0/7000000   00:00:00   00:00:00   00:00:00    streaming  async       0              active

Unmanaged Replication Slot Status
No unmanaged replication slots found

Instances status
Name         Database Size  Current LSN  Replication role  Status  QoS         Manager Version  Node
----         -------------  -----------  ----------------  ------  ---         ---------------  ----
mycluster-1  29 MB          0/7000000    Primary           OK      BestEffort  1.21.0           ip-192-168-3-29.ap-northeast-2.compute.internal
mycluster-2  29 MB          0/7000000    Standby (async)   OK      BestEffort  1.21.0           ip-192-168-2-231.ap-northeast-2.compute.internal
mycluster-3  29 MB          0/7000000    Standby (async)   OK      BestEffort  1.21.0           ip-192-168-1-211.ap-northeast-2.compute.internal

kubectl cnpg status mycluster --verbose  # -v; verify the applied config settings
...
PostgreSQL HBA Rules
# Grant local access
local all all peer map=local
# Require client certificate authentication for the streaming_replica user
hostssl postgres streaming_replica all cert
hostssl replication streaming_replica all cert
hostssl all cnpg_pooler_pgbouncer all cert
...

# Check the basic resources: no StatefulSet or Deployment is used! The operator creates each pod directly <- Why?
kubectl get pod,deploy
kubectl get svc,ep,endpointslices -l cnpg.io/cluster=mycluster
kubectl get cm,secret
kubectl get pdb
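
## Why: the operator owns each pod directly (no StatefulSet/Deployment in between), so failover
## and promotion decisions can be made per instance using PostgreSQL-level state, not just pod health.
## (Sketch) confirm that an instance pod's owner is the Cluster resource:
kubectl get pod mycluster-1 -o jsonpath='{.metadata.ownerReferences[0].kind}{"\n"}'  # expected: Cluster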

# Check PV/PVC
kubectl df-pv
kubectl get pvc,pv

## Given that AWS EBS currently backs the PVs, what does the Node Affinity below mean?
kubectl describe pv | grep 'Node Affinity:' -A2
Node Affinity:     
  Required Terms:  
    Term 0:        topology.ebs.csi.aws.com/zone in [ap-northeast-2c]
--
Node Affinity:     
  Required Terms:  
    Term 0:        topology.ebs.csi.aws.com/zone in [ap-northeast-2a]
--
Node Affinity:     
  Required Terms:  
    Term 0:        topology.ebs.csi.aws.com/zone in [ap-northeast-2b]
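
## Meaning: EBS volumes are zonal, so each PV can only attach to nodes in its own AZ;
## each instance pod is therefore pinned to the AZ where its volume lives.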

# (Reference) dump resource and event YAML
kubectl cnpg report cluster mycluster
unzip ~

# (Reference) logging - link
kubectl logs mycluster-1 -c bootstrap-controller
kubectl logs mycluster-2 -c bootstrap-controller

kubectl logs mycluster-1 | jq 'select(.logger=="postgres") | .record.message'
kubectl logs mycluster-2 | jq 'select(.logger=="postgres") | .record.message'
kubectl logs mycluster-3 | jq 'select(.logger=="postgres") | .record.message'
...
"entering standby mode"
...

 

3. Basic Usage of CloudNativePG

# Check the secrets holding the two sets of credentials
kubectl get secret -l cnpg.io/cluster=mycluster
NAME                  TYPE                       DATA   AGE
mycluster-app         kubernetes.io/basic-auth   3      109m
mycluster-superuser   kubernetes.io/basic-auth   3      109m

# superuser account name
kubectl get secrets mycluster-superuser -o jsonpath={.data.username} | base64 -d ;echo
postgres

# superuser account password
kubectl get secrets mycluster-superuser -o jsonpath={.data.password} | base64 -d ;echo
UUTWc0Apwp8i0BbXd06ja6B4LURuFPAsuSlV6XKzoebMqEksQdANdNF0cPlQxio0

kubectl get secrets mycluster-superuser -o jsonpath={.data.pgpass} | base64 -d
mycluster-rw:5432:*:postgres:TOk00xrh8kuT1uJJpMOuwotWag3dsFieXIuZwzn04OZYoyA8GPdza2xu8OgiUMTd

# app account name
kubectl get secrets mycluster-app -o jsonpath={.data.username} | base64 -d ;echo
app

# app account password
kubectl get secrets mycluster-app -o jsonpath={.data.password} | base64 -d ;echo
nLDgzc1ZGTT5BC1NV8Rjw1bfO7yuwoATg01WgjpOyTxyUyckrmLVHbQ3oI4IJUYp

# Store the app account password in a variable
AUSERPW=$(kubectl get secrets mycluster-app -o jsonpath={.data.password} | base64 -d)

# Deploy 3 myclient pods: using envsubst
## PODNAME=myclient1 VERSION=15.3.0 envsubst < myclient.yaml | kubectl apply -f -
curl -s https://raw.githubusercontent.com/gasida/DOIK/main/5/myclient-new.yaml -o myclient.yaml
for ((i=1; i<=3; i++)); do PODNAME=myclient$i VERSION=15.3.0 envsubst < myclient.yaml | kubectl apply -f - ; done
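
## (Sketch) with the app password captured in AUSERPW, a libpq connection URI skips the password prompt:
kubectl exec -it myclient1 -- psql "postgresql://app:${AUSERPW}@mycluster-rw:5432/app" -c "\conninfo"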

# [myclient1] connect to the mycluster-rw service as the superuser
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 --variable=HISTFILE=/tmp/.psql_history
-------------
# Check connection info
postgres=# \conninfo
You are connected to database "postgres" as user "postgres" on host "mycluster-rw" (address "10.200.1.40") at port "5432".
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)

# List databases
postgres=# \l
List of databases
   Name    |  Owner   | Encoding | Collate | Ctype |   Access privileges
-----------+----------+----------+---------+-------+-----------------------
 app       | app      | UTF8     | C       | C     |
 postgres  | postgres | UTF8     | C       | C     |
 template0 | postgres | UTF8     | C       | C     | =c/postgres          +
           |          |          |         |       | postgres=CTc/postgres
 template1 | postgres | UTF8     | C       | C     | =c/postgres          +
           |          |          |         |       | postgres=CTc/postgres
(4 rows)

# Check the timezone
postgres=# SELECT * FROM pg_timezone_names WHERE name = current_setting('TIMEZONE');
    name    | abbrev | utc_offset | is_dst
------------+--------+------------+--------
 Asia/Seoul | KST    | 09:00:00   | f
(1 row)

# Quit
postgres=# \q
-------------

# [myclient1] connect to mycluster-rw as the superuser and list the databases
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -l


# [myclient1] connect to mycluster-rw as the app user and use the app database >> type the app account password at the prompt
kubectl exec -it myclient1 -- psql -U app -h mycluster-rw -p 5432 -d app -W  --variable=HISTFILE=/tmp/.psql_history
or09VUb1rCWTVFqVWKWURls23wsslhC8Ru0UFYQ2nBnKFRiRPfqdcyXehba12nSw
-------------
app=> \conninfo
You are connected to database "app" as user "app" on host "mycluster-rw" (address "10.200.1.40") at port "5432".
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)

app=> \l
app=> \dt
app=> \q
-------------
  • For external access, restrict connectivity for security, or use an NLB (security group, internal) instead of a CLB
# Install the postgresql psql client
yum install postgresql -y

# Expose the service externally via a LoadBalancer: wait 3-5 minutes, then try connecting below
kubectl get svc,ep mycluster-rw
kubectl patch svc mycluster-rw -p '{"spec":{"type":"LoadBalancer"}}'
kubectl annotate service mycluster-rw "external-dns.alpha.kubernetes.io/hostname=psql.$MyDomain"

# Connect from outside the cluster
psql -U postgres -h psql.$MyDomain
----------------------------------
postgres=# \conninfo
postgres=# \l
postgres=# \q
----------------------------------

 

Load the DVD Rental Sample Database for testing, verify replication, and check how access differs per Service - Link

# Download
curl -LO https://www.postgresqltutorial.com/wp-content/uploads/2019/05/dvdrental.zip
unzip dvdrental.zip

# Copy dvdrental.tar into the myclient1 pod
kubectl cp dvdrental.tar myclient1:/tmp

# [myclient1] connect to mycluster-rw as the superuser and create the database
kubectl exec -it myclient1 -- createdb -U postgres -h mycluster-rw -p 5432 dvdrental
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -l

# Load the DVD Rental Sample Database
kubectl exec -it myclient1 -- pg_restore -U postgres -d dvdrental /tmp/dvdrental.tar -h mycluster-rw -p 5432

# Query the actor table of the DVD Rental Sample Database
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d dvdrental -c "SELECT * FROM actor"

 

Access each pod and verify that the DVD Rental Sample Database replication is in sync

# Let's connect to the mycluster-ro and mycluster-r (any) services
kubectl get svc,ep,endpointslices -l cnpg.io/cluster=mycluster

# Set pod IP variables
POD1=$(kubectl get pod mycluster-1 -o jsonpath={.status.podIP})
POD2=$(kubectl get pod mycluster-2 -o jsonpath={.status.podIP})
POD3=$(kubectl get pod mycluster-3 -o jsonpath={.status.podIP})

# Count the rows of the actor table on each pod
kubectl exec -it myclient1 -- psql -U postgres -h $POD1 -p 5432 -d dvdrental -c "SELECT COUNT(*) FROM actor"
kubectl exec -it myclient1 -- psql -U postgres -h $POD2 -p 5432 -d dvdrental -c "SELECT COUNT(*) FROM actor"
kubectl exec -it myclient1 -- psql -U postgres -h $POD3 -p 5432 -d dvdrental -c "SELECT COUNT(*) FROM actor"

# (Reference) query through each service
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -d dvdrental -c "SELECT * FROM actor"
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d dvdrental -c "SELECT * FROM actor"
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-r  -p 5432 -d dvdrental -c "SELECT * FROM actor"

 

Checking the differences between rw, ro, and r

 

# rw
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -c "select inet_server_addr();"
for i in {1..30}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 192

# ro
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -c "select inet_server_addr();"
for i in {1..30}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 192

# r
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-r -p 5432 -c "select inet_server_addr();"
for i in {1..30}; do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-r -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 192

4. CloudNativePG Failure Testing

Preparation for the failure tests

# Set pod IP variables
POD1=$(kubectl get pod mycluster-1 -o jsonpath={.status.podIP})
POD2=$(kubectl get pod mycluster-2 -o jsonpath={.status.podIP})
POD3=$(kubectl get pod mycluster-3 -o jsonpath={.status.podIP})

# query.sql
curl -s -O https://raw.githubusercontent.com/gasida/DOIK/main/5/query.sql
cat query.sql ;echo
CREATE DATABASE test;
\c test;
CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 TEXT NOT NULL);
INSERT INTO t1 VALUES (1, 'Luis');

# Run the SQL query file
psql -U postgres -h psql.$MyDomain -f query.sql
or
kubectl cp query.sql myclient1:/tmp
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -f /tmp/query.sql

# [Terminal 2] monitoring
while true; do kubectl exec -it myclient2 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT COUNT(*) FROM t1"; date;sleep 1; done

# Verify
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT COUNT(*) FROM t1"
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT * FROM t1"

# INSERT
psql -U postgres -h psql.$MyDomain -d test -c "INSERT INTO t1 VALUES (2, 'Luis2');"
psql -U postgres -h psql.$MyDomain -d test -c "SELECT * FROM t1"
or
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "INSERT INTO t1 VALUES (2, 'Luis2');"
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT * FROM t1"

# INSERT rows 3 through 100 into the test database
#for ((i=3; i<=100; i++)); do psql -U postgres -h $POD1 -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done
for ((i=3; i<=100; i++)); do kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT COUNT(*) FROM t1"

[Failure 1] Force-delete one primary pod (instance) and verify behavior - Link

 

# Check the primary pod info
kubectl cnpg status mycluster

# [Terminal 1] monitoring
watch kubectl get pod -l cnpg.io/cluster=mycluster

# [Terminal 2] monitoring
while true; do kubectl exec -it myclient2 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT COUNT(*) FROM t1"; date;sleep 1; done

# [Terminal 3] INSERT a large amount of data into the test database
for ((i=301; i<=10000; i++)); do kubectl exec -it myclient2 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done
for ((i=10001; i<=20000; i++)); do kubectl exec -it myclient2 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done
or
for ((i=301; i<=10000; i++)); do psql -U postgres -h psql.$MyDomain -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done
for ((i=10001; i<=20000; i++)); do psql -U postgres -h psql.$MyDomain -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done

# [Terminal 4] delete the pod >> does the INSERT stream get cut off mid-way?
kubectl get pod -l cnpg.io/cluster=mycluster -owide
kubectl delete pvc/mycluster-1 pod/mycluster-1
kubectl cnpg status mycluster

# Check pod info: compare the pod names and their placement
kubectl get pod -l cnpg.io/cluster=mycluster -owide
kubectl get pod -l cnpg.io/cluster=mycluster
NAME          READY   STATUS    RESTARTS   AGE
mycluster-2   1/1     Running   0          125m
mycluster-3   1/1     Running   0          125m
mycluster-4   1/1     Running   0          2m18s

[Failure 2] Drain the node running the primary pod (instance) and verify behavior

# (Optional) check the operator logs
kubectl get pod -n operators -l app.kubernetes.io/name=cloudnative-pg
kubectl logs -n operators -l app.kubernetes.io/name=cloudnative-pg -f

# Drain a worker node
# kubectl drain <<노드>> --ignore-daemonsets --delete-emptydir-data
kubectl get node
NODE=<specify your own EC2 node name>
NODE=ip-192-168-3-231.ap-northeast-2.compute.internal
kubectl drain k8s-w1 --delete-emptydir-data --force --ignore-daemonsets && kubectl get node -w
or
kubectl drain $NODE --delete-emptydir-data --force --ignore-daemonsets && kubectl get node -w

# Check cluster info: what is the main reason the pod is Pending? >> For example, what would happen if there were a healthy worker node in the same AZ?
kubectl get pod -owide
kubectl cnpg status mycluster

# After verifying behavior, uncordon the node
kubectl uncordon $NODE
 
Increasing the pod volume (online) - Link
 
# Monitoring
watch kubectl get pod,pvc

# Increase the PVC from 3Gi to 5Gi: it cannot be shrunk afterwards > check that the AWS EBS volume grows
kubectl df-pv
kubectl patch cluster mycluster --type=merge -p '{"spec":{"storage":{"size":"5Gi"}}}'
kubectl describe cluster mycluster
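
## (Sketch) online PVC expansion only works if the StorageClass allows it:
kubectl get storageclass -o custom-columns=NAME:.metadata.name,EXPANSION:.allowVolumeExpansion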
 
Changing the primary pod: kubectl cnpg promote mycluster <Name>
 
# Monitoring
watch -d kubectl cnpg status mycluster

# Promote a different primary pod
kubectl cnpg promote mycluster mycluster-4

# Verify: connect via the rw service and check
kubectl cnpg status mycluster
kubectl exec -it myclient3 -- psql -U postgres -h mycluster-rw -p 5432 -c "select inet_server_addr();"

5. CloudNativePG Scaling & Rolling Updates

Scale test

# Check info
kubectl cnpg status mycluster
kubectl get cluster mycluster
NAME        AGE    INSTANCES   READY   STATUS                     PRIMARY
mycluster   167m   3           3       Cluster in healthy state   mycluster-6

# Monitoring
watch kubectl get pod

# Scale up to 5 instances: after the scale-up and join complete, verify the connections below
kubectl patch cluster mycluster --type=merge -p '{"spec":{"instances":5}}' && kubectl get pod -l postgresql=mycluster -w
kubectl get cluster mycluster
kubectl cnpg status mycluster

# Check -r (any) connections
for i in {1..30}; do kubectl exec -it myclient2 -- psql -U postgres -h mycluster-r -p 5432 -c "select inet_server_addr();"; done | sort | uniq -c | sort -nr | grep 192

# Scale down to 3 instances
kubectl patch cluster mycluster --type=merge -p '{"spec":{"instances":3}}' && kubectl get pod -l postgresql=mycluster -w

 

Rolling updates: standbys are restarted first → before the primary is updated, a switchover (see the options sketched below) minimizes downtime - Upgrades Rolling_Updates
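
How the primary gets updated is declarative. As a sketch (field names per the CloudNativePG docs): primaryUpdateStrategy decides whether the operator updates the primary automatically (unsupervised) or waits for a manual step (supervised), and primaryUpdateMethod chooses between a switchover to an already-updated replica and an in-place restart.

# (Sketch) prefer a switchover over restarting the primary during rolling updates
kubectl patch cluster mycluster --type=merge -p '{"spec":{"primaryUpdateMethod":"switchover"}}'
kubectl get cluster mycluster -o jsonpath='{.spec.primaryUpdateStrategy}{"\n"}'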

# [Terminal 1] monitoring
watch kubectl get pod -l cnpg.io/cluster=mycluster

# [Terminal 2] monitoring
while true; do kubectl exec -it myclient2 -- psql -U postgres -h mycluster-ro -p 5432 -d test -c "SELECT COUNT(*) FROM t1"; date;sleep 1; done

# [Terminal 3] INSERT a large amount of data into the test database
for ((i=10000; i<=20000; i++)); do kubectl exec -it myclient3 -- psql -U postgres -h mycluster-rw -p 5432 -d test -c "INSERT INTO t1 VALUES ($i, 'Luis$i');";echo; done

# [Terminal 4] update postgresql:15.3 → postgresql:15.4 >> watch the order and the procedure
kubectl cnpg status mycluster # check the primary pod and the image version
kubectl patch cluster mycluster --type=merge -p '{"spec":{"imageName":"ghcr.io/cloudnative-pg/postgresql:15.4"}}' && kubectl get pod -l postgresql=mycluster -w

# Verify
kubectl get cluster mycluster
kubectl cnpg status mycluster | grep Image





# Delete the cluster for the next exercise
kubectl delete cluster mycluster && kubectl delete pod --all

 

6. CloudNativePG Miscellaneous

  • PgBouncer (open-source connection pooling) support: adds a database access layer that can handle authentication, monitoring, and logging, and reuses connections

 

Source: https://cloudnative-pg.io/documentation/current/connection_pooling/

 

 

 

 

# Install a new cluster: synchronous replication
cat <<EOT> mycluster2.yaml
# Example of PostgreSQL cluster
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: mycluster
spec:
  imageName: ghcr.io/cloudnative-pg/postgresql:16.0
  instances: 3  
  storage:
    size: 3Gi
  postgresql:
    pg_hba:
      - host all postgres all trust
  enableSuperuserAccess: true
  minSyncReplicas: 1
  maxSyncReplicas: 2
  monitoring:
    enablePodMonitor: true
EOT

kubectl apply -f mycluster2.yaml && kubectl get pod -w

# Check the synchronous replication info
watch kubectl cnpg status mycluster
kubectl cnpg status mycluster
Streaming Replication status
Replication Slots Enabled
Name         Sent LSN   Write LSN  Flush LSN  Replay LSN  Write Lag  Flush Lag  Replay Lag  State      Sync State  Sync Priority  Replication Slot
----         --------   ---------  ---------  ----------  ---------  ---------  ----------  -----      ----------  -------------  ----------------
mycluster-2  0/604DF70  0/604DF70  0/604DF70  0/604DF70   00:00:00   00:00:00   00:00:00    streaming  quorum      1              active
mycluster-3  0/604DF70  0/604DF70  0/604DF70  0/604DF70   00:00:00   00:00:00   00:00:00    streaming  quorum      1              active

Unmanaged Replication Slot Status
No unmanaged replication slots found

Instances status
Name         Database Size  Current LSN  Replication role  Status  QoS         Manager Version  Node
----         -------------  -----------  ----------------  ------  ---         ---------------  ----
mycluster-1  29 MB          0/604DF70    Primary           OK      BestEffort  1.21.0           ip-192-168-2-231.ap-northeast-2.compute.internal
mycluster-2  29 MB          0/604DF70    Standby (sync)    OK      BestEffort  1.21.0           ip-192-168-3-29.ap-northeast-2.compute.internal
mycluster-3  29 MB          0/604DF70    Standby (sync)    OK      BestEffort  1.21.0           ip-192-168-1-211.ap-northeast-2.compute.internal
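
## (Sketch) the quorum configuration surfaces in synchronous_standby_names; once the myclient pods
## are redeployed below, it can be checked with:
kubectl exec -it myclient1 -- psql -U postgres -h mycluster-rw -p 5432 -c "SHOW synchronous_standby_names;"
## expected output along the lines of: ANY 1 ("mycluster-2","mycluster-3")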

# PgBouncer pods must be installed in the same namespace as the cluster
cat <<EOT> pooler.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
  name: pooler-rw
spec:
  cluster:
    name: mycluster
  instances: 3
  type: rw
  pgbouncer:
    poolMode: session
    parameters:
      max_client_conn: "1000"
      default_pool_size: "10"
EOT

kubectl apply -f pooler.yaml && kubectl get pod -w

# 확인
kubectl get pooler
kubectl get svc,ep pooler-rw

# superuser account password
kubectl get secrets mycluster-superuser -o jsonpath={.data.password} | base64 -d ; echo
nhp8ymj6I7lSQcUk08FJtprwJzRR0ZojdCdx4sQbjjKW61JtrWRrMAioqI1xmzWz

# Verify the connection: the pooler's authentication settings apply! Does the backend pod change on repeated connections?
for ((i=1; i<=3; i++)); do PODNAME=myclient$i VERSION=15.3.0 envsubst < myclient.yaml | kubectl apply -f - ; done
kubectl exec -it myclient1 -- psql -U postgres -h pooler-rw -p 5432 -c "select inet_server_addr();"
nhp8ymj6I7lSQcUk08FJtprwJzRR0ZojdCdx4sQbjjKW61JtrWRrMAioqI1xmzWz
...

# (Optional) Monitoring Metrics
kubectl get pod -l cnpg.io/poolerName=pooler-rw -owide
curl <pod IP>:9127/metrics

cat <<EOT> podmonitor-pooler-rw.yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: pooler-rw
spec:
  selector:
    matchLabels:
      cnpg.io/poolerName: pooler-rw
  podMetricsEndpoints:
  - port: metrics
EOT
kubectl apply -f podmonitor-pooler-rw.yaml