Upgrade from 8.1.0.x to 8.1.0.6
1. Upgrade From 8.1.0.x to 8.1.0.6
RDAF Platform: From 8.1.0.x to 8.1.0.6 (Selected Services)
RDAF Deployment rdaf CLI: From 1.4.1 to 1.4.1.2
OIA (AIOps) Application: From 8.1.0.x to 8.1.0.6 (Selected Services)
RDAF Studio: From 8.1.0.x to 8.1.0.6
1.1. Prerequisites
Before proceeding with this upgrade, verify that the prerequisites below are met.
- RDAF Deployment CLI version: 1.4.1
- Infra Services tag: 1.0.4
- Platform Services and RDA Worker tag: 8.1.0.x
- OIA Application Services tag: 8.1.0.x
Note
- Check the disk space on all Platform and Service VMs using the command below. Disk usage (the Use% column) should be less than 80%.
rdauser@oia-125-216:~/collab-3.7-upgrade$ df -kh
Filesystem Size Used Avail Use% Mounted on
udev 32G 0 32G 0% /dev
tmpfs 6.3G 357M 6.0G 6% /run
/dev/mapper/ubuntu--vg-ubuntu--lv 48G 12G 34G 26% /
tmpfs 32G 0 32G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 32G 0 32G 0% /sys/fs/cgroup
/dev/loop0 64M 64M 0 100% /snap/core20/2318
/dev/loop2 92M 92M 0 100% /snap/lxd/24061
/dev/sda2 1.5G 309M 1.1G 23% /boot
/dev/sdf 50G 3.8G 47G 8% /var/mysql
/dev/loop3 39M 39M 0 100% /snap/snapd/21759
/dev/sdg 50G 541M 50G 2% /minio-data
/dev/loop4 92M 92M 0 100% /snap/lxd/29619
/dev/loop5 39M 39M 0 100% /snap/snapd/21465
/dev/sde 15G 140M 15G 1% /zookeeper
/dev/sdd 30G 884M 30G 3% /kafka-logs
/dev/sdc 50G 3.3G 47G 7% /opt
/dev/sdb 50G 29G 22G 57% /var/lib/docker
/dev/sdi 25G 294M 25G 2% /graphdb
/dev/sdh 50G 34G 17G 68% /opensearch
/dev/loop6 64M 64M 0 100% /snap/core20/2379
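The 80% check can also be scripted. The following is a minimal sketch, assuming POSIX `df -P` output columns and assuming that snap loop devices (which always report 100%) can be ignored. `check_disk_usage` is a hypothetical helper name and is not part of the rdaf CLI.

```shell
# Hypothetical helper: list filesystems at or above a usage threshold (default 80).
# Reads `df -P` style output on stdin; /dev/loop* devices are skipped (assumption).
check_disk_usage() {
    threshold="${1:-80}"
    awk -v t="$threshold" '
        NR > 1 && $1 !~ /^\/dev\/loop/ {
            use = $5
            sub(/%/, "", use)               # "57%" -> "57"
            if (use + 0 >= t) print $6, $5  # mount point and usage
        }'
}

# Usage on each VM:
#   df -P | check_disk_usage 80
```

No output means every checked filesystem is below the threshold.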
Warning
Make sure all of the above pre-requisites are met before proceeding with the upgrade process.
Warning
Non-Kubernetes: Upgrading RDAF Platform and AIOps application services is a disruptive operation. Schedule a maintenance window before upgrading RDAF Platform and AIOps services to a newer version.
Important
Please make sure a full backup of the RDAF platform system is completed before performing the upgrade.
Non-Kubernetes: Please run the below backup command to take the backup of application data.
Note: Please make sure this backup-dir is mounted across all infra and CLI VMs.
- Verify that the RDAF deployment rdaf CLI version is 1.4.1 on the VM where the CLI was installed for the Docker on-premise registry and for managing Non-Kubernetes deployments.
- On-premise docker registry service version is 1.0.3
ff6b1de8515f cfxregistry.CloudFabrix.io:443/docker-registry:1.0.3 "/entrypoint.sh /bin…" 7 days ago Up 7 days deployment-scripts-docker-registry-1
- RDAF Platform services version is 8.1.0.x
Run the below command to get RDAF Platform services details
- RDAF OIA Application services version is 8.1.0.x
Run the below command to get RDAF App services details
1.1.1 RDAF Deployment CLI Upgrade
- Download the RDAF Deployment CLI's newer version 1.4.1.2 bundle
wget https://macaw-amer.s3.us-east-1.amazonaws.com/releases/rdaf-platform/1.4.1.2/rdafcli-1.4.1.2.tar.gz
- Upgrade the rdaf CLI to version 1.4.1.2
- Verify the installed rdaf CLI version is upgraded to 1.4.1.2
- Download the RDAF Deployment CLI's newer version 1.4.1.2 bundle and copy it to RDAF management VM on which rdaf & rdafk8s deployment CLI was installed.
1.1.2 Pre-Upgrade: MariaDB Health Check and FSM Index Setup
- Check MariaDB Memory Capacity
Get the current memory capacity allocated to MariaDB. On each node where MariaDB is deployed, run the docker command below.
rdauser@infra108122:~$ docker stats --no-stream | grep mariadb
afb5b248bc1d infra-mariadb-1 0.15% 320.9MiB / 8GiB 3.92% 0B / 0B 15.5MB / 579MB 30
Note
By default, the memory allocated to MariaDB is 8GB. If the environment is managing a higher load, it is recommended to increase the MariaDB container memory to 16GB or higher.
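To compare the limit against a target on every node, the `docker stats` output can be reduced to the container name and memory limit. This is a sketch only: `mariadb_mem_report` is a hypothetical helper, and it assumes the container name contains "mariadb" and that limits are reported in MiB or GiB as in the sample above.

```shell
# Hypothetical helper: report each MariaDB container's memory limit.
# Reads "<name> <usage> / <limit>" lines on stdin,
# e.g. "infra-mariadb-1 320.9MiB / 8GiB". $1 = minimum limit in GiB (default 8).
mariadb_mem_report() {
    min_gib="${1:-8}"
    awk -v min="$min_gib" '
        /mariadb/ {
            limit = $NF
            gib = limit + 0                       # numeric prefix of the limit
            if (limit ~ /MiB$/) gib = gib / 1024  # normalise MiB to GiB
            printf "%s limit=%s %s\n", $1, limit, (gib >= min ? "ok" : "below " min "GiB")
        }'
}

# Usage on each MariaDB node:
#   docker stats --no-stream --format '{{.Name}} {{.MemUsage}}' | mariadb_mem_report 16
```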
- Create Single Column Index in FSM Transition Table
Get the current number of rows and create a single column index in the FSM transition table.
Login to the MariaDB running instance using the below command from the CLI VM.
rdauser@infra108122:~$ mysql -u rdaf -prdaf123! -h <virtualip> -P 3307
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 49622
Server version: 11.4.5-MariaDB-log Source distribution
Once logged in, verify the available databases
+-----------------------------------------------------------------+
| Database |
+-----------------------------------------------------------------+
| 7280ba4c39af4e068598875c5f01fbe3_alert_processor |
| 7280ba4c39af4e068598875c5f01fbe3_cfx_app_controller |
| 7280ba4c39af4e068598875c5f01fbe3_cfxdimensions_app_access_manag |
| 7280ba4c39af4e068598875c5f01fbe3_cfxdimensions_app_collaboratio |
| 7280ba4c39af4e068598875c5f01fbe3_cfxdimensions_app_file_browser |
| 7280ba4c39af4e068598875c5f01fbe3_cfxdimensions_app_irm_service |
| 7280ba4c39af4e068598875c5f01fbe3_cfxdimensions_app_notification |
| 7280ba4c39af4e068598875c5f01fbe3_cfxdimensions_app_resource_man |
| 7280ba4c39af4e068598875c5f01fbe3_configuration_service |
| 7280ba4c39af4e068598875c5f01fbe3_fsm |
| 7280ba4c39af4e068598875c5f01fbe3_identity |
| 7280ba4c39af4e068598875c5f01fbe3_ml_config |
| 7280ba4c39af4e068598875c5f01fbe3_rda |
| 7280ba4c39af4e068598875c5f01fbe3_services_state |
| 7280ba4c39af4e068598875c5f01fbe3_user_preferences |
| information_schema |
| mysql |
| performance_schema |
| saasportal |
| sys |
| test |
+-----------------------------------------------------------------+
21 rows in set (0.00 sec)
Switch to the FSM database
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
Verify the tables available in the FSM database
+------------------------------------------------+
| Tables_in_7280ba4c39af4e068598875c5f01fbe3_fsm |
+------------------------------------------------+
| timer |
| transition |
+------------------------------------------------+
2 rows in set (0.00 sec)
Check the current number of rows in the transition table using the command given below
Note
If the number of rows is 100,000 (1 lakh) or more, please contact the Fabrix.ai support team to create indexes on the above table.
Create the required indexes on the transition table using the below queries.
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected, 1 warning (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 1
Verify the indexes have been created successfully using the command given below.
+------------+------------+---------------------------------------------------+--------------+---------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
+------------+------------+---------------------------------------------------+--------------+---------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| transition | 0 | PRIMARY | 1 | uuid | A | 85722 | NULL | NULL | | BTREE | | | NO |
| transition | 0 | PRIMARY | 2 | current_state | A | 171445 | NULL | NULL | | BTREE | | | NO |
| transition | 1 | idx_uuid_is_final_state_last_transition_timestamp | 1 | uuid | A | 171445 | NULL | NULL | | BTREE | | | NO |
| transition | 1 | idx_uuid_is_final_state_last_transition_timestamp | 2 | is_final_state | A | 171445 | NULL | NULL | | BTREE | | | NO |
| transition | 1 | idx_uuid_is_final_state_last_transition_timestamp | 3 | last_transition_timestamp | A | 171445 | NULL | NULL | YES | BTREE | | | NO |
| transition | 1 | idx_finalstate_ts_uuid | 1 | is_final_state | A | 1 | NULL | NULL | | BTREE | | | NO |
| transition | 1 | idx_finalstate_ts_uuid | 2 | last_transition_timestamp | A | 85722 | NULL | NULL | YES | BTREE | | | NO |
| transition | 1 | idx_finalstate_ts_uuid | 3 | uuid | A | 171445 | NULL | NULL | | BTREE | | | NO |
| transition | 1 | idx_uuid_solo | 1 | uuid | A | 171445 | NULL | NULL | | BTREE | | | NO |
+------------+------------+---------------------------------------------------+--------------+---------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
9 rows in set (0.00 sec)
Once the indexes are verified, exit the MariaDB session
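The read-only checks in this section can be run in one pass. The sketch below assumes standard MariaDB statements (`USE`, `SHOW TABLES`, `SELECT COUNT(*)`, `SHOW INDEX`) and the `transition` table shown above. The helper names are hypothetical, and the index creation statements are deliberately left out because they are not shown in this guide.

```shell
# Hypothetical helper: print the read-only checks for the FSM database.
# $1 = tenant FSM database name as listed in the database output above.
fsm_check_sql() {
    db="$1"
    cat <<EOF
USE \`$db\`;
SHOW TABLES;
SELECT COUNT(*) AS transition_rows FROM transition;
SHOW INDEX FROM transition;
EOF
}

# Hypothetical helper: succeeds (exit 0) when the row count reaches the
# 100,000 threshold at which support should create the indexes.
needs_support_for_index() {
    [ "$1" -ge 100000 ]
}

# Usage (-p prompts for the password instead of passing it on the command line):
#   fsm_check_sql 7280ba4c39af4e068598875c5f01fbe3_fsm | mysql -u rdaf -p -h <virtualip> -P 3307
#   needs_support_for_index 171445 && echo "contact support before creating indexes"
```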
1.2. Upgrade Steps
1.2.1 Download the new Docker Images
Log in to the VM where the rdaf deployment CLI was installed for the Docker on-premise registry and for managing the Non-Kubernetes deployment.
Download the new docker image tags for RDAF Platform and OIA (AIOps) Application services and wait until all of the images are downloaded.
Note
If the download of the images fails, please re-execute the above command.
Run the below command to verify above mentioned tags are downloaded for all of the RDAF Platform and OIA (AIOps) Application services.
Please make sure 8.1.0.6 image tag is downloaded for the below RDAF Platform services.
- rda-api-server
- rda-scheduler
- rda-collector
- rda-fsm
- cfx-rda-access-manager
- portal-backend
- portal-frontend
- rda-worker-all
- rda-studio
Please make sure 8.1.0.6 image tag is downloaded for the below RDAF OIA (AIOps) Application services.
- cfx-rda-app-controller
- cfx-rda-alert-processor
- cfx-rda-irm-service
Downloaded Docker images are stored under the below path.
/opt/rdaf-registry/data/docker/registry/v2/ or /opt/rdaf/data/docker/registry/v2/
Run the below command to check the filesystem's disk usage on offline registry VM where docker images are pulled.
If necessary, older image tags that are no longer in use can be deleted to free up disk space using the command below.
Note
Run the command below if /opt occupies more than 80% of the disk space or if the free capacity of /opt is less than 25GB.
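The two cleanup conditions (usage above 80%, or less than 25GB free) can be evaluated together. A minimal sketch, assuming `df -P -k` output in 1024-byte blocks and treating 25GB as 25 * 1024 * 1024 KB; `registry_needs_cleanup` is a hypothetical helper name.

```shell
# Hypothetical helper: decide whether old image tags should be cleaned up.
# Reads `df -P -k <path>` output on stdin; exits 0 when usage is above 80%
# or available space is below 25 GB, and exits 1 otherwise.
registry_needs_cleanup() {
    awk 'NR == 2 {
            use = $5
            sub(/%/, "", use)
            low = ($4 + 0 < 25 * 1024 * 1024)   # available KB below 25 GB
            exit !((use + 0 > 80) || low)
         }'
}

# Usage on the registry VM:
#   df -P -k /opt | registry_needs_cleanup && echo "free up space under /opt"
```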
1.2.2 Upgrade RDAF Platform Services
Warning
For Non-Kubernetes deployments, upgrading RDAF Platform and AIOps application services is a disruptive operation when the rolling-upgrade option is not used. Please schedule a maintenance window before upgrading RDAF Platform and AIOps services to a newer version.
Run the below command to initiate upgrading RDAF Platform services with zero downtime
rdaf platform upgrade --tag 8.1.0.6 --service rda_api_server --service rda_scheduler --service rda_collector --service rda_fsm --service cfx-rda-access-manager --service portal-backend --service portal-frontend --rolling-upgrade --timeout 10
Note
The --timeout value of 10 in the above command is specified in seconds.
Note
The rolling-upgrade option upgrades the Platform services running in high-availability mode on one VM at a time in sequence. It completes the upgrade of Platform services running on VM-1 before upgrading them on VM-2, followed by VM-3, and so on.
During this upgrade sequence, RDAF platform continues to function without any impact to the application traffic.
After completing the Platform services upgrade on all VMs, the CLI asks for user confirmation to delete the older version Platform service PODs. Enter YES to delete the old Docker containers (in Non-Kubernetes deployments).
192.168.108.122:5000/ubuntu-rda-client-api-server:8.1.0.6
2025-10-30 10:32:43,693 [rdaf.component.platform] INFO - Gathering platform container details.
2025-10-30 10:32:44,246 [rdaf.component.platform] INFO - Gathering rdac pod details.
+----------+------------+---------+---------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+------------+---------+---------+--------------+-------------+------------+
| 7a4a239a | collector | 8.1.0.x | 4:02:59 | a46780fed2d5 | None | True |
| fc561b34 | asm | 8.1.0.x | 4:01:55 | bf6e4f48aa1e | None | True |
| 16bfb91a | api-server | 8.1.0.x | 4:04:33 | 51faaac76f07 | None | True |
+----------+------------+---------+---------+--------------+-------------+------------+
Continue moving above pods to maintenance mode? [yes/no]: yes
2025-10-30 10:33:00,533 [rdaf.component.platform] INFO - Initiating Maintenance Mode...
2025-10-30 10:33:25,621 [rdaf.component.platform] INFO - Following container are in maintenance mode
+----------+------------+---------+---------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+------------+---------+---------+--------------+-------------+------------+
| 16bfb91a | api-server | 8.1.0.x | 4:05:08 | 51faaac76f07 | maintenance | False |
| fc561b34 | asm | 8.1.0.x | 4:02:30 | bf6e4f48aa1e | maintenance | False |
| 7a4a239a | collector | 8.1.0.x | 4:03:34 | a46780fed2d5 | maintenance | False |
+----------+------------+---------+---------+--------------+-------------+------------+
2025-10-30 10:33:25,622 [rdaf.component.platform] INFO - Waiting for timeout of 2 seconds...
2025-10-30 10:33:27,622 [rdaf.component.platform] INFO - Upgrading service: rda_collector on host 192.168.
Run the below command to upgrade the RDAF Platform services without the rolling-upgrade option. This is a disruptive operation.
Please wait until all of the new Platform services are in Up state, then run the below command to verify their status and make sure the upgraded services are running with the 8.1.0.6 version.
+--------------------------+----------------+-------------------------------+--------------+---------+
| Name | Host | Status | Container Id | Tag |
+--------------------------+----------------+-------------------------------+--------------+---------+
| rda_api_server | 192.168.108.51 | Up 4 hours | dc2dd806e6a6 | 8.1.0.6 |
| rda_api_server | 192.168.108.52 | Up 4 hours | a76257df0330 | 8.1.0.6 |
| rda_registry | 192.168.108.51 | Up 4 hours | f23455c6b85b | 8.1.0.1 |
| rda_registry | 192.168.108.52 | Up 4 hours | 3b8deb15ad1f | 8.1.0.1 |
| rda_scheduler | 192.168.108.51 | Up 4 hours | 1864f7e88bfb | 8.1.0.6 |
| rda_scheduler | 192.168.108.52 | Up 4 hours | 62089081e902 | 8.1.0.6 |
| rda_collector | 192.168.108.51 | Up 4 hours | 50c81f436fd9 | 8.1.0.6 |
| rda_collector | 192.168.108.52 | Up 4 hours | 754db49f2804 | 8.1.0.6 |
| rda_identity | 192.168.108.51 | Up 4 hours | 37625fde83e8 | 8.1.0.1 |
| rda_identity | 192.168.108.52 | Up 4 hours | bb60423a47fa | 8.1.0.1 |
| rda_asm | 192.168.108.51 | Up 4 hours | 5ae15e7d661e | 8.1.0.1 |
| rda_asm | 192.168.108.52 | Up 4 hours | 80181bb0f80e | 8.1.0.1 |
| rda_fsm | 192.168.108.51 | Up 4 hours | bfaf7206eacb | 8.1.0.6 |
| rda_fsm | 192.168.108.52 | Up 4 hours | 8c470b9d7b08 | 8.1.0.6 |
+--------------------------+----------------+-------------------------------+--------------+---------+
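Because only selected services move to 8.1.0.6, it can help to check just those rows of the status table. The sketch below assumes the table layout shown above (Name, Host, Status, Container Id, Tag); `check_service_tags` is a hypothetical helper that reads the table on stdin.

```shell
# Hypothetical helper: list services from a status table that are not on the expected tag.
# Reads the "| Name | Host | Status | Container Id | Tag |" table on stdin.
# $1 = expected tag; remaining arguments = service names to check.
check_service_tags() {
    tag="$1"; shift
    awk -F'|' -v tag="$tag" -v names="$*" '
        BEGIN { n = split(names, a, " "); for (i = 1; i <= n; i++) want[a[i]] = 1 }
        NF >= 6 {
            name = $2; host = $3; have = $6
            gsub(/ /, "", name); gsub(/ /, "", host); gsub(/ /, "", have)
            if ((name in want) && have != tag) print name, host, have
        }'
}

# Usage: pipe the status table into the helper, for example
#   <status command> | check_service_tags 8.1.0.6 rda_api_server rda_scheduler rda_collector rda_fsm
```

No output means every listed service reports the expected tag on every host.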
Run the below command to check that an rda-scheduler service is shown as the elected leader under the Site column.
Run the below command to check that all services have ok status and do not report any failure messages.
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
| Cat | Pod-Type | Host | ID | Site | Health Parameter | Status | Message |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------|
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-status | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | minio-connectivity | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-initialization-status | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | kafka-connectivity | ok | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=1, Brokers=[1, 2, 3] |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-status | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | minio-connectivity | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-initialization-status | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | kafka-connectivity | ok | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=3, Brokers=[1, 2, 3] |
| rda_app | alert-processor | c6cc7b04ab33 | b4ebfb06 | | service-status | ok | |
| rda_app | alert-processor | c6cc7b04ab33 | b4ebfb06 | | minio-connectivity | ok | |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
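On a full deployment the health check table is long, so filtering for rows that are not ok makes failures easier to spot. A minimal sketch, assuming the column layout shown above (Status is the seventh column) and treating any Status value other than ok as a failure; `failed_health_checks` is a hypothetical helper.

```shell
# Hypothetical helper: print health-check rows whose Status column is not "ok".
# Reads the health check table on stdin; border and header rows are ignored.
failed_health_checks() {
    awk -F'|' '
        NF >= 9 {
            status = $8; gsub(/ /, "", status)
            if (status != "ok" && status != "Status") {
                pod = $3; param = $7
                gsub(/ /, "", pod); gsub(/ /, "", param)
                print pod, param, status
            }
        }'
}

# Usage: pipe the health check table into the helper, for example
#   <health check command> | failed_health_checks
```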
1.2.3 Upgrade RDA Worker Services
Note
If the worker was deployed in a HTTP proxy environment, please make sure the required HTTP proxy environment variables are added in /opt/rdaf/deployment-scripts/values.yaml file under rda_worker configuration section as shown below before upgrading RDA Worker services.
rda_worker:
  mem_limit: 8G
  memswap_limit: 8G
  privileged: false
  environment:
    RDA_ENABLE_TRACES: 'no'
    RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
    http_proxy: "http://test:1234@192.168.122.107:3128"
    https_proxy: "http://test:1234@192.168.122.107:3128"
    HTTP_PROXY: "http://test:1234@192.168.122.107:3128"
    HTTPS_PROXY: "http://test:1234@192.168.122.107:3128"
- Upgrade RDA Worker Services
Please run the below command to initiate upgrading the RDA Worker Service with zero downtime
Note
If the worker is deployed in a proxy environment, add the required environment proxy variables in /opt/rdaf/deployment-scripts/values.yaml, under the section rda_worker -> env:, instead of making changes to worker.yaml (Recommended only if there are any new changes needed for the worker)
Note
The --timeout value of 10 in the above command is specified in seconds.
Note
The rolling-upgrade option upgrades the Worker services running in high-availability mode on one VM at a time in sequence. It completes the upgrade of Worker services running on VM-1 before upgrading them on VM-2, followed by VM-3, and so on.
After completing the Worker services upgrade on all VMs, the CLI asks for user confirmation. Enter YES to delete the older version Worker service PODs.
Digest: sha256:728962901928f166dfc1a3d5d7ad931c133621d1abac598e140af3249905836c
Status: Downloaded newer image for 192.168.108.122:5000/ubuntu-rda-worker-all:8.1.0.6
192.168.108.122:5000/ubuntu-rda-worker-all:8.1.0.6
2025-10-30 10:52:34,199 [rdaf.component.worker] INFO - Collecting worker details for rolling upgrade
2025-10-30 10:52:45,508 [rdaf.component.worker] INFO - Rolling upgrade worker on 192.168.108.127
+----------+----------+---------+---------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+----------+---------+---------+--------------+-------------+------------+
| c2544331 | worker | 8.1.0.3 | 4:10:38 | 5fac8ebe85fe | None | True |
+----------+----------+---------+---------+--------------+-------------+------------+
Continue moving above pod to maintenance mode? [yes/no]: yes
2025-10-30 10:54:30,732 [rdaf.component.worker] INFO - Initiating maintenance mode for pod c2544331
2025-10-30 10:54:52,747 [rdaf.component.worker] INFO - Following worker container is in maintenance mode
+----------+----------+---------+---------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+----------+---------+---------+--------------+-------------+------------+
| c2544331 | worker | 8.1.0.3 | 4:12:43 | 5fac8ebe85fe | maintenance | False
Please run the below command to upgrade the RDA Worker Service without the rolling-upgrade option (with downtime).
Please wait for 120 seconds to let the newer version of RDA Worker service containers join the RDA Fabric appropriately. Run the below commands to verify the status of the newer RDA Worker service containers.
| Infra | worker | True | 6eff605e72c4 | a318f394 | rda-site-01 | 13:45:13 | 4 | 31.21 | 0 | 0 |
| Infra | worker | True | ae7244d0d10a | 554c2cd8 | rda-site-01 | 13:40:40 | 4 | 31.21 | 0 | 0 |
+------------+----------------+------------+--------------+---------+
| Name | Host | Status | Container Id | Tag |
+------------+----------------+------------+--------------+---------+
| rda_worker | 192.168.108.53 | Up 4 hours | ea187f89505f | 8.1.0.6 |
| rda_worker | 192.168.108.54 | Up 4 hours | a62b3230bbaa | 8.1.0.6 |
+------------+----------------+------------+--------------+---------+
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+
| Cat | Pod-Type | Host | ID | Site | Health Parameter | Status | Message |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------|
| rda_infra | api-server | 1b0542719618 | 1845ae67 | | service-status | ok | |
| rda_infra | api-server | 1b0542719618 | 1845ae67 | | minio-connectivity | ok | |
| rda_infra | api-server | d4404cffdc7a | a4cfdc6d | | service-status | ok | |
| rda_infra | api-server | d4404cffdc7a | a4cfdc6d | | minio-connectivity | ok | |
| rda_infra | asm | 8d3d52a7a475 | 418c9dc1 | | service-status | ok | |
| rda_infra | asm | 8d3d52a7a475 | 418c9dc1 | | minio-connectivity | ok | |
| rda_infra | asm | ab172a9b8229 | 2ac1d67a | | service-status | ok | |
| rda_infra | asm | ab172a9b8229 | 2ac1d67a | | minio-connectivity | ok | |
| rda_app | asset-dependency | 6ac69ca1085c | c2e9dcb9 | | service-status | ok | |
| rda_app | asset-dependency | 6ac69ca1085c | c2e9dcb9 | | minio-connectivity | ok | |
| rda_app | asset-dependency | 58a5f4f460d3 | 0b91caac | | service-status | ok | |
| rda_app | asset-dependency | 58a5f4f460d3 | 0b91caac | | minio-connectivity | ok | |
| rda_app | authenticator | 9011c2aef498 | 9f7efdc3 | | service-status | ok | |
| rda_app | authenticator | 9011c2aef498 | 9f7efdc3 | | minio-connectivity | ok | |
| rda_app | authenticator | 9011c2aef498 | 9f7efdc3 | | DB-connectivity | ok | |
| rda_app | authenticator | 148621ed8c82 | dbf16b82 | | service-status | ok | |
| rda_app | authenticator | 148621ed8c82 | dbf16b82 | | minio-connectivity | ok | |
| rda_app | authenticator | 148621ed8c82 | dbf16b82 | | DB-connectivity | ok | |
| rda_app | cfx-app-controller | 75ec0f30cfa3 | 1198fdee | | service-status | ok | |
| rda_app | cfx-app-controller | 75ec0f30cfa3 | 1198fdee | | minio-connectivity | ok | |
| rda_app | cfx-app-controller | 75ec0f30cfa3 | 1198fdee | | service-initialization-status | ok | |
| rda_app | cfx-app-controller | 75ec0f30cfa3 | 1198fdee | | DB-connectivity | ok |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+
1.2.4 Upgrade OIA Application Services
Run the below commands to initiate upgrading the RDA Fabric OIA Application services with zero downtime
rdaf app upgrade OIA --tag 8.1.0.6 --service cfx-rda-app-controller --service cfx-rda-alert-processor --service cfx-rda-alert-correlator --service cfx-rda-irm-service --rolling-upgrade --timeout 10
Note
The --timeout value of 10 in the above command is specified in seconds.
Note
The rolling-upgrade option upgrades the OIA application services running in high-availability mode on one VM at a time in sequence. It completes the upgrade of OIA application services running on VM-1 before upgrading them on VM-2, followed by VM-3, and so on.
After completing the OIA application services upgrade on all VMs, it will ask for user confirmation to delete the older version OIA application service PODs.
2025-10-30 11:23:36,923 [rdaf.component.oia] INFO - Gathering OIA app container details.
2025-10-30 11:23:38,026 [rdaf.component.oia] INFO - Gathering rdac pod details.
+----------+-----------------------+---------+---------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+-----------------------+---------+---------+--------------+-------------+------------+
| 1c14f6eb | alert-ingester | 8.1.0.x | 4:29:25 | c236e62139f8 | None | True |
| 66ed0060 | alert-processor | 8.1.0.x | 4:27:14 | c6672a224c54 | None | True |
| b95a768f | event-consumer | 8.1.0.x | 4:27:48 | 3f4516b3e057 | None | True |
| acecb0b7 | alert-processor- | 8.1.0.x | 4:23:44 | 02cf84e94ea7 | None | True |
| | companion | | | | | |
| 75c30a77 | alert-correlator | 8.1.0.x | 4:26:41 | 895c6b108728 | None | True |
| 73cc3ae8 | cfxdimensions-app- | 8.1.0.x | 4:24:55 | b8d988286bf3 | None | True |
| | collaboration | | | | | |
+----------+-----------------------+---------+---------+--------------+-------------+------------+
Continue moving above pods to maintenance mode? [yes/no]: yes
2025-10-30 11:23:55,371 [rdaf.component.oia] INFO - Initiating Maintenance Mode...
2025-10-30 11:24:18,171 [rdaf.component.oia] INFO - Following container are in maintenance mode
+----------+-----------------------+---------+---------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+-----------------------+---------+---------+--------------+-------------+------------+
| 75c30a77 | alert-correlator | 8.1.0.x | 4:27:06 | 895c6b108728 | maintenance | False |
| 1c14f6eb | alert-ingester | 8.1.0.x | 4:29:50 | c236e62139f8 | maintenance | False |
| 66ed0060 | alert-processor | 8.1.0.x | 4:27:39 | c6672a224c54 | maintenance | False |
| acecb0b7 | alert-processor- | 8.1.0.x | 4:24:09 | 02cf84e94ea7 | maintenance | False |
| | companion | | | | | |
| 73cc3ae8 | cfxdimensions-app- | 8.1.0.x | 4:25:21 | b8d988286bf3 | maintenance | False |
| | collaboration | | | | | |
| b95a768f | event-consumer | 8.1.0.x | 4:28:13 | 3f4516b3e057 | maintenance | False |
+----------+-----------------------+---------+---------+--------------+-------------+------------+
Run the below command to upgrade the RDA Fabric OIA Application services without the rolling-upgrade option (with downtime).
Please wait till all of the new OIA application service containers are in Up state and run the below command to verify their status and make sure they are running with 8.1.0.6 version.
+--------------------------+-----------------+-----------+--------------+---------+
| Name | Host | Status | Container Id | Tag |
+--------------------------+-----------------+-----------+--------------+---------+
| cfx-rda-app-controller | 192.168.108.127 | Up 6 days | f6447547ee74 | 8.1.0.6 |
| cfx-rda-app-controller | 192.168.108.128 | Up 6 days | a0b93ed591b4 | 8.1.0.6 |
| cfx-rda-alert-processor | 192.168.108.127 | Up 6 days | 8a7ccbe6fce6 | 8.1.0.6 |
| cfx-rda-alert-processor | 192.168.108.128 | Up 6 days | b1cbecda63cb | 8.1.0.6 |
| cfx-rda-alert-correlator | 192.168.108.127 | Up 6 days | 5d8316cf08b4 | 8.1.0.6 |
| cfx-rda-alert-correlator | 192.168.108.128 | Up 6 days | 863f40e26df1 | 8.1.0.6 |
| cfx-rda-irm-service | 192.168.108.127 | Up 6 days | 164f38367d0e | 8.1.0.6 |
| cfx-rda-irm-service | 192.168.108.128 | Up 6 days | 4ea62df19baf | 8.1.0.6 |
+--------------------------+-----------------+-----------+--------------+---------+
Run the below command to verify all OIA application services are up and running.
+-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------+
| Cat | Pod-Type | Pod-Ready | Host | ID | Site | Age | CPUs | Memory(GB) | Active Jobs | Total Jobs |
|-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------|
| App | alert-ingester | True | rda-alert-inge | 6a6e464d | | 19:22:36 | 8 | 31.33 | | |
| App | alert-ingester | True | rda-alert-inge | 7f6b42a0 | | 19:22:53 | 8 | 31.33 | | |
| App | alert-processor | True | rda-alert-proc | a880e491 | | 19:23:21 | 8 | 31.33 | | |
| App | alert-processor | True | rda-alert-proc | b684609e | | 19:23:18 | 8 | 31.33 | | |
| App | alert-processor-companion | True | rda-alert-proc | 874f3b33 | | 19:22:24 | 8 | 31.33 | | |
| App | alert-processor-companion | True | rda-alert-proc | 70cadaa7 | | 19:22:05 | 8 | 31.33 | | |
| App | asset-dependency | True | rda-asset-depe | bde06c15 | | 19:47:50 | 8 | 31.33 | | |
| App | asset-dependency | True | rda-asset-depe | 47b9eb02 | | 19:47:38 | 8 | 31.33 | | |
| App | authenticator | True | rda-identity-d | faa33e1b | | 19:47:52 | 8 | 31.33 | | |
| App | authenticator | True | rda-identity-d | 36083c36 | | 19:47:46 | 8 | 31.33 | | |
| App | cfx-app-controller | True | rda-app-contro | 5fd3c3f4 | | 19:23:09 | 8 | 31.33 | | |
| App | cfx-app-controller | True | rda-app-contro | d66e5ce8 | | 19:22:56 | 8 | 31.33 | | |
| App | cfxdimensions-app-access-manager | True | rda-access-man | ecbb535c | | 19:47:46 | 8 | 31.33 | | |
| App | cfxdimensions-app-access-manager | True | rda-access-man | 9a05db5a | | 19:47:36 | 8 | 31.33 | | |
| App | cfxdimensions-app-collaboration | True | rda-collaborat | 61b3c53b | | 19:22:18 | 8 | 31.33 | | |
| App | cfxdimensions-app-collaboration | True | rda-collaborat | 09b9474e | | 19:21:57 | 8 | 31.33 | | |
| App | cfxdimensions-app-file-browser | True | rda-file-brows | 00495640 | | 19:22:45 | 8 | 31.33 | | |
| App | cfxdimensions-app-file-browser | True | rda-file-brows | 640f0653 | | 19:22:29 | 8 | 31.33 | | |
| App | cfxdimensions-app-irm_service | True | rda-irm-servic | 27e345c5 | | 19:21:43 | 8 | 31.33 | | |
| App | cfxdimensions-app-irm_service | True | rda-irm-servic | 23c7e082 | | 19:21:56 | 8 | 31.33 | | |
| App | cfxdimensions-app-notification-service | True | rda-notificati | bbb5b08b | | 19:23:20 | 8 | 31.33 | | |
| App | cfxdimensions-app-notification-service | True | rda-notificati | 9841bcb5 | | 19:23:02 | 8 | 31.33 | | |
+-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------+
Run the below command to check that all services have ok status and do not report any failure messages.
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
| Cat | Pod-Type | Host | ID | Site | Health Parameter | Status | Message |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------|
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-status | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | minio-connectivity | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-initialization-status | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | kafka-connectivity | ok | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=1, Brokers=[1, 2, 3] |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-status | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | minio-connectivity | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-initialization-status | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | kafka-connectivity | ok | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=2, Brokers=[1, 2, 3] |
| rda_app | alert-processor | c6cc7b04ab33 | b4ebfb06 | | service-status | ok | |
| rda_app | alert-processor | c6cc7b04ab33 | b4ebfb06 | | minio-connectivity | ok | |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
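When the health check table is long, it can be easier to scan it programmatically. The following is a minimal sketch, assuming the command output has been captured as text in the pipe-delimited layout shown above. The function name failed_checks is hypothetical and is not part of the RDAF tooling.

```python
def failed_checks(table_text):
    """Return (pod_type, health_parameter, status) for every row whose Status is not 'ok'."""
    failures = []
    header = None
    for line in table_text.splitlines():
        line = line.strip()
        # Skip border lines such as "+----+" and "|----+----|".
        if not line.startswith("|") or line.startswith("|-"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            # The first pipe-delimited line holds the column names.
            header = cells
            continue
        row = dict(zip(header, cells))
        if row.get("Status", "").lower() != "ok":
            failures.append(
                (row.get("Pod-Type"), row.get("Health Parameter"), row.get("Status"))
            )
    return failures
```

An empty list means every row reported ok; any returned tuple names the pod type and health parameter to investigate.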
1.2.5 Upgrade Script
- Please download the below Python script (rdaf_upgrade_1411_1412.py).
wget https://macaw-amer.s3.us-east-1.amazonaws.com/releases/rdaf-platform/1.4.1.2/rdaf_upgrade_1411_1412.py
- Execute the following upgrade script on the CLI VM.
When it runs, the upgrade script automatically backs up the existing my_custom.cnf and haproxy.cfg files on all VMs where they are deployed and generates new versions of both files that incorporate the required changes. The updated configurations produced by the script are documented below.
haproxy.cfg & my_custom.cnf
Note
The updated haproxy.cfg and my_custom.cnf configurations are shown below.
haproxy.cfg
global
nbthread 8
cpu-map auto:1-8 0-7
maxconn 20000
log 127.0.0.1 local2
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin
stats timeout 30s
user haproxy
group haproxy
external-check
insecure-fork-wanted
ssl-default-bind-options no-sslv3 no-tls-tickets force-tlsv12
ssl-default-bind-ciphers EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH
ssl-server-verify none
tune.ssl.default-dh-param 2048
tune.ssl.cachesize 100000
tune.ssl.lifetime 600
tune.ssl.maxrecord 16384
tune.http.maxhdr 1024
max-spread-checks 5s
spread-checks 10
defaults
log global
mode http
retries 3
maxconn 20000
timeout connect 5s
timeout client 900s
timeout server 900s
timeout http-request 60s
timeout http-keep-alive 60s
timeout queue 30s
option httplog
option log-separate-errors
option log-health-checks
option redispatch
option http-keep-alive
option forwardfor
option tcp-smart-accept
option tcp-smart-connect
frontend stats
mode http
bind *:7222
stats enable
stats uri /stats
stats refresh 60s
frontend minio
mode http
bind *:9443 ssl crt /opt/certificates/haproxy.pem alpn h2,http/1.1
http-request set-header X-Forwarded-Port %[dst_port]
http-request set-header X-Forwarded-Proto https if { ssl_fc }
redirect scheme https if !{ ssl_fc }
default_backend minio
frontend mariadb
bind *:3307
timeout client 660s
mode tcp
default_backend mariadb
maxconn 2000
frontend portal
bind *:80
bind *:443 ssl crt /opt/certificates/haproxy.pem
acl WEBHOOK_PATH path_beg -i /webhooks/
use_backend webhook if WEBHOOK_PATH
timeout client 30s
mode http
rate-limit sessions 250
http-request set-header X-Forwarded-Port %[dst_port]
http-request set-header X-Forwarded-Proto https if { ssl_fc }
redirect scheme https unless { ssl_fc }
default_backend portal
backend portal
mode http
balance roundrobin
http-check send meth HEAD uri /
http-check expect rstatus (2|3)[0-9][0-9]
http-check disable-on-404
default-server inter 10s downinter 5s
cookie rdafportal insert indirect nocache maxidle 30m maxlife 24h
# Issue: https://github.com/cloudfabrix/rda/issues/2560
# On a setup where ui prefix is configured but portal is accessed from internal directly,
# Some icons may be addressed using aiops or aips-uat prefix.
# Which will not work because they are not going through nginx proxy replacement
# These requests directly come to haproxy instead of through nginx running on 8443
http-request set-path "%[path,regsub(^/aiops/,/)]"
http-request set-path "%[path,regsub(^aiops/,/)]"
http-request set-path "%[path,regsub(^/aiops-uat/,/)]"
http-request set-path "%[path,regsub(^aiops-uat/,/)]"
server portal-192.168.108.125 192.168.108.125:7780 check cookie rdaf-portal-1
server portal-192.168.108.126 192.168.108.126:7780 check cookie rdaf-portal-2
frontend rda-server
bind *:8808
mode http
http-request set-header X-Forwarded-Port %[dst_port]
default_backend rda-server
backend rda-server
mode http
balance roundrobin
http-check send meth HEAD uri /rdac
http-check expect rstatus (2|3)[0-9][0-9]
http-check disable-on-404
default-server inter 10s downinter 5s
server rda-api-server-192.168.108.125 192.168.108.125:8807
server rda-api-server-192.168.108.126 192.168.108.126:8807
backend mariadb
mode tcp
balance roundrobin
option tcpka
timeout server 660s
timeout connect 10s
default-server inter 10s downinter 5s
option external-check
external-check command /maria_cluster_check
server mariadb-192.168.108.122 192.168.108.122:3306 check backup
server mariadb-192.168.108.123 192.168.108.123:3306 check
server mariadb-192.168.108.124 192.168.108.124:3306 check backup
backend minio
mode http
balance roundrobin
option forwardfor
http-check send meth HEAD uri /minio/health/live ver HTTP/1.1 hdr host localhost
http-check expect rstatus (2|3)[0-9][0-9]
http-check disable-on-404
default-server inter 10s downinter 5s
cookie mnserverid insert indirect nocache maxidle 30m maxlife 24h
server minio-192.168.108.122 192.168.108.122:9000 check cookie rdaf-objstr-1
server minio-192.168.108.123 192.168.108.123:9000 check cookie rdaf-objstr-2
server minio-192.168.108.124 192.168.108.124:9000 check cookie rdaf-objstr-3
server minio-192.168.108.125 192.168.108.125:9000 check cookie rdaf-objstr-4
backend webhook
mode http
balance roundrobin
stick-table type ip size 10k expire 10m
stick on src
option httpchk GET /healthcheck
http-check expect rstatus (2|3)[0-9][0-9]
http-check disable-on-404
http-response set-header Cache-Control no-store
http-response set-header Pragma no-cache
default-server inter 10s downinter 5s fall 3 rise 2
cookie SERVERID insert indirect nocache maxidle 30m maxlife 24h httponly secure
server rdaf-webhook-1 192.168.108.127:8888 check cookie rdaf-webhook-1
server rdaf-webhook-2 192.168.108.128:8888 check cookie rdaf-webhook-2
frontend smtp
bind 0.0.0.0:25
mode tcp
timeout client 1m
log global
option tcplog
default_backend smtp
backend smtp
mode tcp
log global
option tcplog
timeout server 1m
timeout connect 7s
server rdaf-smtp-1 192.168.108.127:8456
server rdaf-smtp-2 192.168.108.128:8456
my_custom.cnf
[mysqld]
transaction_isolation=READ-COMMITTED
binlog_format=ROW
#Logging
log_error = /opt/rdaf/log/mariadb.log
log_queries_not_using_indexes = 1
long_query_time = 5
slow_query_log = 0 # Disabled for production
slow_query_log_file = /opt/rdaf/log/mariadb-slow.log
#Log expiry
expire_logs_days=1
max_connections=256
max_connect_errors=1000000
connect_timeout=10
max_allowed_packet=128M
# The wait_timeout system variable sets the time in seconds that the
# server waits for an idle interactive connection to become active before closing it.
wait_timeout=720
interactive_timeout=720
net_read_timeout=300
net_write_timeout=300
idle_transaction_timeout=300
#InnoDB tables
innodb_buffer_pool_size=2G
#Log File should be .25 of Buffer pool Size.
innodb_log_file_size=1G
innodb_log_buffer_size=64M
innodb_file_per_table=1
innodb_flush_log_at_trx_commit=1
innodb_lock_wait_timeout=5
innodb_purge_threads=4
innodb_write_io_threads=8
innodb_read_io_threads=8
innodb_flush_method=O_DIRECT
innodb_max_dirty_pages_pct=10
innodb_max_dirty_pages_pct_lwm=5
innodb_io_capacity=600
innodb_print_all_deadlocks=ON
[galera]
wsrep_provider_options="evs.suspect_timeout=PT30S; evs.inactive_timeout=PT45S; evs.inactive_check_period=PT15S"
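The relationship between the InnoDB log file size and the buffer pool size can be checked with a few lines of code. This is a minimal sketch, assuming the file content is available as text and that sizes use the K, M, or G suffixes seen above. The helper names parse_size, read_settings, and log_to_pool_ratio are hypothetical.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def parse_size(value):
    """Convert a size such as '64M' or '2G' to bytes; plain integers pass through."""
    value = value.strip().upper()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)

def read_settings(cnf_text):
    """Collect key=value pairs, ignoring comments and [section] headers."""
    settings = {}
    for line in cnf_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" in line and not line.startswith("["):
            key, val = line.split("=", 1)
            settings[key.strip()] = val.strip()
    return settings

def log_to_pool_ratio(cnf_text):
    """Return innodb_log_file_size divided by innodb_buffer_pool_size."""
    s = read_settings(cnf_text)
    return parse_size(s["innodb_log_file_size"]) / parse_size(s["innodb_buffer_pool_size"])
```

With the values shown above (a 2G buffer pool and a 1G log file), log_to_pool_ratio returns 0.5.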
rdauser@infra108122:~$ python rdaf_upgrade_1411_1412.py upgrade
Creating backup of haproxy.cfg on 192.168.108.122
Updating haproxy configuration on 192.168.108.122
HAProxy configuration updated on 192.168.108.122
Creating backup of haproxy.cfg on 192.168.108.123
Updating haproxy configuration on 192.168.108.123
HAProxy configuration updated on 192.168.108.123
Creating backup of my_custom.cnf on 192.168.108.122
Updating MariaDB configuration on 192.168.108.122
MariaDB configuration updated on 192.168.108.122
Creating backup of my_custom.cnf on 192.168.108.123
Updating MariaDB configuration on 192.168.108.123
MariaDB configuration updated on 192.168.108.123
Creating backup of my_custom.cnf on 192.168.108.124
Updating MariaDB configuration on 192.168.108.124
MariaDB configuration updated on 192.168.108.124
rdauser@infra108122:~$
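Because the script backs up the old files before writing the new ones, the two versions can be compared to review what changed. The sketch below assumes both versions have already been read into strings; the backup file names and locations are not stated here, so none are hard-coded. The helper names config_diff and changed_settings are hypothetical.

```python
import difflib

def config_diff(old_text, new_text, name="my_custom.cnf"):
    """Return unified-diff lines between the backed-up and regenerated config text."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=name + " (backup)",
        tofile=name + " (new)",
        lineterm="",
    ))

def changed_settings(old_text, new_text):
    """Return only the added or removed lines, without the diff file headers."""
    return [
        line for line in config_diff(old_text, new_text)
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]
```

Lines prefixed with "-" come from the backup and lines prefixed with "+" come from the regenerated file.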
Note
If the environment is running under high load, reach out to the Fabrix.ai support team to further tune the parameters for the workload. After receiving the tuned my_custom.cnf and haproxy.cfg files, please follow the below steps.
Copy the new my_custom.cnf file to all MariaDB VMs
- Log in, one by one, to each VM where a MariaDB container is running.
- Navigate to the below directory.
- Copy the new my_custom.cnf file into this directory.
Copy the new haproxy.cfg file to all HAProxy VMs
- Log in, one by one, to each VM where an HAProxy container is running.
- Navigate to the below directory.
- Copy the new haproxy.cfg file into this directory.
Restart MariaDB Manually
- MariaDB nodes must be restarted manually. Follow the steps below to restart the MariaDB cluster.
Step 1. Log in to the CLI VM and run the command given below to get the order of the MariaDB nodes.
[mariadb]
datadir = 192.168.108.50/var/mysql,192.168.108.56/var/mysql,192.168.108.58/var/mysql
user = xxxxxxxxxxxxxx
password = xxxxxxxxxxx
host = 192.168.108.50,192.168.108.56,192.168.108.58
master_id = 0
Step 2. Log in to all the MariaDB nodes and identify the node that holds the bootstrap configuration. Navigate to the below-mentioned path to find the bootstrap information.
- Under the mariadb section, update the configuration as shown below.
rdauser@infra108122:~$ vi /opt/rdaf/rdaf.cfg
rdauser@infra108122:~$ vi /opt/rdaf/deployment-scripts/192.168.108.122/infra.yaml
mariadb:
  image: 192.168.108.122:5000/rda-platform-mariadb:1.0.4
  restart: 'no'
  network_mode: host
  mem_limit: 8G
  memswap_limit: 8G
  oom_kill_disable: false
  volumes:
    - /var/mysql:/bitnami/mariadb/data/
    - /opt/rdaf/config/mariadb:/opt/bitnami/mariadb/conf/bitnami/
    - /opt/rdaf/logs/mariadb:/opt/rdaf/log/
  logging:
    driver: json-file
    options:
      max-size: 10m
      max-file: '5'
  environment:
    - MARIADB_GALERA_MARIABACKUP_USER=rdaf_backup
    - MARIADB_GALERA_MARIABACKUP_PASSWORD=rdaf_backup
    - MARIADB_GALERA_NODE_ADDRESS=192.168.108.122
    - MARIADB_GALERA_NODE_NAME=192.168.108.122
    - MARIADB_USER=rdaf
    - MARIADB_PASSWORD=rdaf123!
    - MARIADB_ROOT_PASSWORD=rdaf123!
    - MARIADB_GALERA_CLUSTER_NAME=rdaf_galera
    - MARIADB_REPLICATION_USER=rdaf_replica
    - MARIADB_REPLICATION_PASSWORD=rdaf_replica
    - MARIADB_GALERA_CLUSTER_ADDRESS=gcomm://192.168.108.122,192.168.108.123,192.168.108.124
    - MARIADB_GALERA_CLUSTER_BOOTSTRAP=no
Step 3. The MariaDB container on the VM that holds the bootstrap configuration identified in Step 2 must be restarted last. Restart MariaDB in reverse order, starting with the third node: log in to the third node and restart the container.
rdauser@infra108124:~$ docker ps -a | grep maria
ac633bfab6c4 192.168.108.122:5000/rda-platform-mariadb:1.0.4 "/opt/bitnami/script…" 25 hours ago Up 25 hours infra-mariadb-1
rdauser@infra108124:~$ docker stop -t 120 ac633bfab6c4
ac633bfab6c4
rdauser@infra108124:~$ docker start ac633bfab6c4
ac633bfab6c4
rdauser@infra108124:~$ cd /opt/rdaf/logs/mariadb/
rdauser@infra108124:/opt/rdaf/logs/mariadb$ ls
auto-restart.log auto-restart.log.1.gz mariadb-slow.log mariadb-slow.log.1.gz mariadb.log mariadb.log.1.gz
rdauser@infra108124:/opt/rdaf/logs/mariadb$ grep WSREP *
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Member 1.0 (192.168.108.123) synced with group.
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Processing event queue:... 100.0% (1/1 events) complete.
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 3060)
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Server 192.168.108.123 synced with group
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Server status change joined -> synced
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Synchronized with group, ready for connections
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
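The restart sequence described in Step 3 (reverse order, bootstrap node last) can be written down explicitly before starting. This is a minimal sketch, assuming the node list is given in the order obtained in Step 1 and the bootstrap node is the one identified in Step 2. The function name restart_order is hypothetical.

```python
def restart_order(nodes, bootstrap_node):
    """Return the nodes in reverse order, with the bootstrap node moved to the end."""
    if bootstrap_node not in nodes:
        raise ValueError("bootstrap node is not in the node list")
    # Walk the list backwards and hold the bootstrap node back until last.
    others = [node for node in reversed(nodes) if node != bootstrap_node]
    return others + [bootstrap_node]
```

For the three example nodes 192.168.108.122, 192.168.108.123, and 192.168.108.124 with 192.168.108.122 as the bootstrap node, the function returns the third node, then the second, then the first.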
Step 4. Run the following MySQL commands to verify that the MariaDB node has successfully rejoined the Galera cluster.
rdauser@infra108123:/opt/rdaf/logs/mariadb$ mysql -h 192.168.108.123 -P 3306 -urdaf -prdaf123!
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 59
Server version: 11.4.5-MariaDB-log Source distribution
Copyright (c) 2000, 2026, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
- Please use the below command to check if the node is connected to the cluster.
+-----------------+-------+
| Variable_name | Value |
+-----------------+-------+
| wsrep_connected | ON |
+-----------------+-------+
1 row in set (0.00 sec)
- Use the command given below to verify the node is ready to accept queries.
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| wsrep_ready | ON |
+---------------+-------+
1 row in set (0.00 sec)
- Please use the below command to check the local node sync state.
+---------------------------+--------+
| Variable_name | Value |
+---------------------------+--------+
| wsrep_local_state_comment | Synced |
+---------------------------+--------+
1 row in set (0.01 sec)
- Use the command given below to confirm all three nodes are part of the cluster.
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 3 |
+--------------------+-------+
1 row in set (0.01 sec)
- Please use the below command to verify the cluster is in Primary state.
+----------------------+---------+
| Variable_name | Value |
+----------------------+---------+
| wsrep_cluster_status | Primary |
+----------------------+---------+
1 row in set (0.00 sec)
- Use the command given below to check the local node index within the cluster.
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| wsrep_local_index | 1 |
+-------------------+-------+
1 row in set (0.00 sec)
- Please use the below command to verify all node addresses are listed.
+--------------------------+----------------------------------------------------------+
| Variable_name | Value |
+--------------------------+----------------------------------------------------------+
| wsrep_incoming_addresses | 192.168.108.122:3306,192.168.108.123:3306,192.168.108.124:3306 |
+--------------------------+----------------------------------------------------------+
1 row in set (0.00 sec)
mysql> exit
Bye
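The checks in Step 4 can be combined into one comparison. This is a minimal sketch, assuming the status values have been collected into a dictionary, for example from the output of a statement such as SHOW STATUS LIKE 'wsrep_%'. The expected values are the ones shown in the sample outputs above; the name galera_problems is hypothetical.

```python
EXPECTED = {
    "wsrep_connected": "ON",
    "wsrep_ready": "ON",
    "wsrep_local_state_comment": "Synced",
    "wsrep_cluster_status": "Primary",
}

def galera_problems(status, expected_size=3):
    """Compare wsrep status variables with the healthy values from Step 4."""
    problems = []
    for name, want in EXPECTED.items():
        got = status.get(name)
        if got != want:
            problems.append(f"{name}: expected {want}, got {got}")
    # The cluster size may arrive as text or as a number, so compare as strings.
    size = status.get("wsrep_cluster_size")
    if str(size) != str(expected_size):
        problems.append(f"wsrep_cluster_size: expected {expected_size}, got {size}")
    return problems
```

An empty result means the node matches the healthy state for a three-node cluster; wsrep_local_index and wsrep_incoming_addresses are left out because their correct values depend on the node and the deployment.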
Step 5. Log in to the second node and restart the MariaDB container.
- After restarting the container, use the tail command to monitor the MariaDB logs for the highlighted line.
rdauser@infra108124:~$ docker ps -a | grep maria
ac633bfab6c4 192.168.108.122:5000/rda-platform-mariadb:1.0.4 "/opt/bitnami/script…" 25 hours ago Up 25 hours infra-mariadb-1
rdauser@infra108124:~$ docker stop -t 120 ac633bfab6c4
ac633bfab6c4
rdauser@infra108124:~$ docker start ac633bfab6c4
ac633bfab6c4
rdauser@infra108124:~$ cd /opt/rdaf/logs/mariadb/
rdauser@infra108124:/opt/rdaf/logs/mariadb$ ls
auto-restart.log auto-restart.log.1.gz mariadb-slow.log mariadb-slow.log.1.gz mariadb.log mariadb.log.1.gz
rdauser@infra108124:/opt/rdaf/logs/mariadb$ grep WSREP *
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Member 1.0 (192.168.108.123) synced with group.
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Processing event queue:... 100.0% (1/1 events) complete.
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 3060)
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Server 192.168.108.123 synced with group
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Server status change joined -> synced
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Synchronized with group, ready for connections
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
- Once the container is restarted, run the MySQL verification commands from Step 4 to confirm the node has rejoined the cluster.
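Since mariadb.log also keeps entries from earlier starts, it helps to confirm that the sync message was logged after the restart. The sketch below assumes the line of interest is the "Synchronized with group, ready for connections" message seen in the sample output, and that log lines carry a timestamp in the format shown above. The name synced_since is hypothetical.

```python
import re
from datetime import datetime

SYNC_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}).*"
    r"WSREP: Synchronized with group, ready for connections"
)

def synced_since(log_lines, restart_time):
    """Return True if a sync message was logged at or after restart_time."""
    for line in log_lines:
        match = SYNC_RE.search(line)
        if match:
            logged_at = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            if logged_at >= restart_time:
                return True
    return False
```

Pass the time at which the container was started as restart_time; older sync messages are then ignored.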
Step 6. Restart MariaDB: First Node (bootstrap node)
- Log in to the first MariaDB node. Before stopping the MariaDB container, navigate to the infra.yaml file path: cd /opt/rdaf/deployment-scripts/<Host IP>/
- To stop the MariaDB container use the below command.
- To start the MariaDB container use the below command.
rdauser@infra108122:~$ cd /opt/rdaf/deployment-scripts/192.168.108.122/
rdauser@infra108122:/opt/rdaf/deployment-scripts/192.168.108.122$ ls
infra.yaml
rdauser@infra108122:/opt/rdaf/deployment-scripts/192.168.108.122$ vi infra.yaml
rdauser@infra108122:/opt/rdaf/deployment-scripts/192.168.108.122$ vi infra.yaml
rdauser@infra108122:/opt/rdaf/deployment-scripts/192.168.108.122$ docker-compose -f infra.yaml --project-name infra rm -fsv mariadb
[+] Stopping 1/1
✔ Container infra-mariadb-1 Stopped 0.6s
rdauser@infra108122:/opt/rdaf/deployment-scripts/192.168.108.122$ docker-compose -f infra.yaml --project-name infra up -d mariadb
[+] Running 1/1
✔ Container infra-mariadb-1 Started
rdauser@infra108122:/opt/rdaf/logs/mariadb$ grep WSREP *
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Member 1.0 (192.168.108.123) synced with group.
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Processing event queue:... 100.0% (1/1 events) complete.
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 3060)
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Server 192.168.108.123 synced with group
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Server status change joined -> synced
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Synchronized with group, ready for connections
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
Step 7. Verify All Nodes Are Synced
- Once the first node is up, run the MySQL verification commands from Step 4 to confirm all nodes have rejoined and are in sync.
Step 9. Restore Bootstrap Configuration on First Node
- Navigate to /opt/rdaf/deployment-scripts/192.168.108.122/infra.yaml.
- Under the mariadb section, update the configuration from its current state to the one shown below.
mariadb:
  image: 192.168.108.122:5000/rda-platform-mariadb:1.0.4
  restart: 'no'
  network_mode: host
  mem_limit: 8G
  memswap_limit: 8G
  oom_kill_disable: false
  volumes:
    - /var/mysql:/bitnami/mariadb/data/
    - /opt/rdaf/config/mariadb:/opt/bitnami/mariadb/conf/bitnami/
    - /opt/rdaf/logs/mariadb:/opt/rdaf/log/
  logging:
    driver: json-file
    options:
      max-size: 10m
      max-file: '5'
  environment:
    - MARIADB_GALERA_MARIABACKUP_USER=rdaf_backup
    - MARIADB_GALERA_MARIABACKUP_PASSWORD=rdaf_backup
    - MARIADB_GALERA_NODE_ADDRESS=192.168.108.122
    - MARIADB_GALERA_NODE_NAME=192.168.108.122
    - MARIADB_USER=rdaf
    - MARIADB_PASSWORD=rdaf123!
    - MARIADB_ROOT_PASSWORD=rdaf123!
    - MARIADB_GALERA_CLUSTER_NAME=rdaf_galera
    - MARIADB_REPLICATION_USER=rdaf_replica
    - MARIADB_REPLICATION_PASSWORD=rdaf_replica
    - MARIADB_GALERA_CLUSTER_ADDRESS=gcomm://192.168.108.122,192.168.108.123,192.168.108.124
    - MARIADB_GALERA_CLUSTER_BOOTSTRAP=yes
Restart HAProxy Containers
Step 10. Identify HAProxy VMs
First, determine which VMs are running HAProxy by running the following command.
Check the IP addresses where HAProxy is running.
Step 11. Log in to the HAProxy VMs and identify the Virtual IP
Now log in to the VMs that are running HAProxy and run the command given below.
From the output, identify the Virtual IP (VIP) configured on the system. In the example output below, the VIP is the address marked as secondary (192.168.108.129).
rdauser@infra108122:~$ ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host noprefixroute
valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:50:56:93:36:53 brd ff:ff:ff:ff:ff:ff
altname enp3s0
inet 192.168.108.122/24 brd 192.168.108.255 scope global ens160
valid_lft forever preferred_lft forever
inet 192.168.108.129/24 scope global secondary ens160
valid_lft forever preferred_lft forever
inet6 fe80::250:56ff:fe93:3653/64 scope link
valid_lft forever preferred_lft forever
Note
In an HA environment, the VM where the Virtual IP (VIP) is present should be restarted last.
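To decide which VM to restart last, the output of ip addr show can be scanned for the VIP. This is a minimal sketch, assuming the VIP appears as a secondary IPv4 address as in the example output above; hosts with other secondary addresses would need a stricter match. The name secondary_addresses is hypothetical.

```python
import re

# Matches IPv4 lines such as "inet 192.168.108.129/24 scope global secondary ens160".
SECONDARY_INET = re.compile(r"^\s*inet (\d+\.\d+\.\d+\.\d+)/\d+ .*\bsecondary\b")

def secondary_addresses(ip_addr_output):
    """Return every IPv4 address flagged as secondary in 'ip addr show' output."""
    found = []
    for line in ip_addr_output.splitlines():
        match = SECONDARY_INET.match(line)
        if match:
            found.append(match.group(1))
    return found
```

A VM for which the function returns the VIP currently holds it and, per the note above, should be restarted last.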
Restart the HAProxy containers on all the VMs where they are deployed.
Step 12. Restart the HAProxy containers one by one on each VM using the docker commands given below.
rdauser@infra108122:~$ docker ps -a | grep hap
9567576aa14c 192.168.108.122:5000/rda-platform-haproxy:1.0.4 "/docker-entry-point…" 26 hours ago Up 4 hours infra-haproxy-1
rdauser@infra108122:~$ docker stop -t 120 9567576aa14c
9567576aa14c
rdauser@infra108122:~$ docker start 9567576aa14c
9567576aa14c
rdauser@infra108122:~$
Step 13. Please verify that HAProxy is healthy by using the command given below. All HAProxy and keepalived entries should show a running/active state.
| haproxy | 192.168.108.122 | Up 2 days | 9567576aa14c | 1.0.4 |
| haproxy | 192.168.108.123 | Up 2 days | 6a0001f08b82 | 1.0.4 |
| keepalived | 192.168.108.122 | active | N/A | N/A |
| keepalived | 192.168.108.123 | active | N/A | N/A |
Step 14. Run the healthcheck using the command given below. All checks should return OK.
| haproxy | Port Connection | OK | N/A | 192.168.108.122 | 9567576aa14c |
| haproxy | Service Status | OK | N/A | 192.168.108.122 | 9567576aa14c |
| haproxy | Firewall Port | OK | N/A | 192.168.108.122 | 9567576aa14c |
| haproxy | Port Connection | OK | N/A | 192.168.108.123 | 6a0001f08b82 |
| haproxy | Service Status | OK | N/A | 192.168.108.123 | 6a0001f08b82 |
| haproxy | Firewall Port | OK | N/A | 192.168.108.123 | 6a0001f08b82 |
| keepalived | Service Status | OK | N/A | 192.168.108.122 | N/A |
| keepalived | Service Status | OK | N/A | 192.168.108.123 | N/A |
1.2.6 RDA Studio Upgrade
Please open the rda-studio.yml file, change the existing image tag to 8.1.0.6 so that it matches the format shown in the example below, and then save the file.
services:
  cfxdx:
    image: docker1.cloudfabrix.io:443/external/ubuntu-cfxdx-nb-nginx-all:8.1.0.6
    restart: unless-stopped
    volumes:
      - /opt/rdaf/cfxdx/home/:/root
      - /opt/rdaf/cfxdx/config/:/tmp/config/
      - /opt/rdaf/cfxdx/output:/tmp/output/
      - /opt/rdaf/config/network_config/:/network_config
    ports:
      - "9998:9998"
    environment:
      #JUPYTER_TOKEN: cfxdxdemo
      NLTK_DATA : "/root/nltk_data"
      CFXDX_CONFIG_FILE: /tmp/config/conf.yml
      RDA_NETWORK_CONFIG: /network_config/config.json
      RDA_USER: xxxxxxx
      RDA_PASSWORD: xxxxxxxxxxxx
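The tag change can also be applied with a small text substitution instead of a manual edit. This is a minimal sketch, assuming the file content has been read into a string and that each image line ends in a tag, as in the example above. The function name set_image_tag is hypothetical.

```python
import re

# Captures everything up to the final ":" on an image line; the tag itself may not
# contain "/" or ":", so a registry port such as ":443/" is never mistaken for it.
IMAGE_TAG = re.compile(r"^([ \t]*image:[ \t]*\S+):[^:/\s]+[ \t]*$", re.MULTILINE)

def set_image_tag(yaml_text, new_tag):
    """Replace the tag on every tagged 'image:' line, keeping the registry and name."""
    return IMAGE_TAG.sub(lambda m: f"{m.group(1)}:{new_tag}", yaml_text)
```

Calling set_image_tag(text, "8.1.0.6") leaves all other lines unchanged; image lines without a tag are not modified.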
After updating the rda-studio.yml file to set the tag version to 8.1.0.6, execute the following commands to pull the latest images and start the services