Upgrade from 8.1.0.x to 8.1.0.6

1. Upgrade From 8.1.0.x to 8.1.0.6

RDAF Platform: From 8.1.0.x to 8.1.0.6 (Selected Services)

RDAF Deployment rdaf CLI: From 1.4.1 to 1.4.1.2

OIA (AIOps) Application: From 8.1.0.x to 8.1.0.6 (Selected Services)

RDAF Studio: From 8.1.0.x to 8.1.0.6

1.1. Prerequisites

Before proceeding with this upgrade, verify that the following prerequisites are met.

  • RDAF Deployment CLI version: 1.4.1

  • Infra Services tag: 1.0.4

  • Platform Services and RDA Worker tag: 8.1.0.x

  • OIA Application Services tag: 8.1.0.x

Note

  • Check the disk space on all Platform and Service VMs using the command below. The usage (Use%) of each data filesystem should be less than 80%. The /dev/loop snap mounts are read-only images and always show 100%.
    df -kh
    
rdauser@oia-125-216:~/collab-3.7-upgrade$ df -kh
Filesystem                         Size  Used Avail Use% Mounted on
udev                                32G     0   32G   0% /dev
tmpfs                              6.3G  357M  6.0G   6% /run
/dev/mapper/ubuntu--vg-ubuntu--lv   48G   12G   34G  26% /
tmpfs                               32G     0   32G   0% /dev/shm
tmpfs                              5.0M     0  5.0M   0% /run/lock
tmpfs                               32G     0   32G   0% /sys/fs/cgroup
/dev/loop0                          64M   64M     0 100% /snap/core20/2318
/dev/loop2                          92M   92M     0 100% /snap/lxd/24061
/dev/sda2                          1.5G  309M  1.1G  23% /boot
/dev/sdf                            50G  3.8G   47G   8% /var/mysql
/dev/loop3                          39M   39M     0 100% /snap/snapd/21759
/dev/sdg                            50G  541M   50G   2% /minio-data
/dev/loop4                          92M   92M     0 100% /snap/lxd/29619
/dev/loop5                          39M   39M     0 100% /snap/snapd/21465
/dev/sde                            15G  140M   15G   1% /zookeeper
/dev/sdd                            30G  884M   30G   3% /kafka-logs
/dev/sdc                            50G  3.3G   47G   7% /opt
/dev/sdb                            50G   29G   22G  57% /var/lib/docker
/dev/sdi                            25G  294M   25G   2% /graphdb
/dev/sdh                            50G   34G   17G  68% /opensearch
/dev/loop6                          64M   64M     0 100% /snap/core20/2379
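
As a rough aid for the check above, the following shell sketch reads df output and prints any mount point at or above a given usage percentage. The function name disk_usage_over is illustrative and not part of the rdaf tooling. Skipping /dev/loop devices is an assumption based on the snap mounts in the sample output.

```shell
# Print "<mount-point> <use%>" for filesystems at or above a usage limit.
# Reads df-style output on stdin (Use% in column 5, mount point in column 6).
# $1: limit in percent (defaults to 80).
disk_usage_over() {
  local limit="${1:-80}"
  awk -v limit="$limit" '
    NR == 1 { next }                 # skip the header row
    $1 ~ /^\/dev\/loop/ { next }     # assumption: ignore snap loop devices
    {
      use = $5
      sub(/%/, "", use)              # strip the percent sign
      if (use + 0 >= limit) print $6, $5
    }'
}

# Example: df -kh | disk_usage_over 80
```

No output means every checked filesystem is below the limit.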

Warning

Make sure all of the above pre-requisites are met before proceeding with the upgrade process.

Warning

Non-Kubernetes: Upgrading RDAF Platform and AIOps application services is a disruptive operation. Schedule a maintenance window before upgrading RDAF Platform and AIOps services to a newer version.

Important

Please make sure a full backup of the RDAF platform system is completed before performing the upgrade.

Non-Kubernetes: Run the command below to back up the application data.

rdaf backup --dest-dir <backup-dir>
Note: Make sure this backup directory is mounted on all infra and CLI VMs.

  • Verify that the rdaf deployment CLI version is 1.4.1 on the VM where the CLI was installed for the on-premise docker registry and for managing Non-Kubernetes deployments.
rdaf --version
RDAF CLI version: 1.4.1
  • On-premise docker registry service version is 1.0.3
docker ps | grep docker-registry
ff6b1de8515f   cfxregistry.CloudFabrix.io:443/docker-registry:1.0.3   "/entrypoint.sh /bin…"   7 days ago   Up 7 days             deployment-scripts-docker-registry-1
  • RDAF Platform services version is 8.1.0.x

Run the below command to get RDAF Platform services details

rdaf platform status
  • RDAF OIA Application services version is 8.1.0.x

Run the below command to get RDAF App services details

rdaf app status

1.1.1 RDAF Deployment CLI Upgrade

  • Download the RDAF Deployment CLI's newer version 1.4.1.2 bundle
wget https://macaw-amer.s3.us-east-1.amazonaws.com/releases/rdaf-platform/1.4.1.2/rdafcli-1.4.1.2.tar.gz 
  • Upgrade the rdaf CLI to version 1.4.1.2
pip install --user rdafcli-1.4.1.2.tar.gz
  • Verify the installed rdaf CLI version is upgraded to 1.4.1.2
rdaf --version
RDAF CLI version: 1.4.1.2
  • Alternatively, for an offline installation, download the RDAF Deployment CLI's newer version 1.4.1.2 offline bundle and copy it to the RDAF management VM on which the rdaf & rdafk8s deployment CLI was installed.
wget https://macaw-amer.s3.us-east-1.amazonaws.com/releases/rdaf-platform/1.4.1.2/offline-ubuntu-1.4.1.2.tar.gz
  • Extract the rdaf CLI software bundle contents
tar -xvzf offline-ubuntu-1.4.1.2.tar.gz
  • Change the directory to the extracted directory
cd offline-ubuntu-1.4.1.2
  • Upgrade the rdaf CLI to version 1.4.1.2
pip install --user rdafcli-1.4.1.2.tar.gz -f ./ --no-index
  • Verify the installed rdaf CLI version
rdaf --version
RDAF CLI version: 1.4.1.2
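
Because 1.4.1 is a prefix of 1.4.1.2, a plain grep for the old version string would also match the new one. A minimal sketch of an exact comparison, assuming the output format shown above (RDAF CLI version: <version>). cli_version_is is a hypothetical helper name, not an rdaf command.

```shell
# Succeed only when the reported CLI version equals the expected one exactly.
# $1: full output line of `rdaf --version`
# $2: expected version string, for example 1.4.1.2
cli_version_is() {
  local reported="${1##*: }"   # keep only the text after the last ": "
  [ "$reported" = "$2" ]
}

# Example:
#   cli_version_is "$(rdaf --version)" 1.4.1.2 || echo "unexpected rdaf CLI version" >&2
```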

1.1.2 Pre-Upgrade: MariaDB Health Check and FSM Index Setup

  • Check MariaDB Memory Capacity

Get the current memory capacity allocated to MariaDB. On each node where MariaDB is deployed, run the docker command below.

docker stats --no-stream | grep mariadb
rdauser@infra108122:~$ docker stats --no-stream | grep mariadb
afb5b248bc1d   infra-mariadb-1   0.15%   320.9MiB / 8GiB   3.92%   0B / 0B   15.5MB / 579MB   30

Note

By default, the memory allocated to MariaDB is 8GB. If the environment is managing a higher load, it is recommended to increase the MariaDB container memory to 16GB or higher.
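
When several nodes need to be checked, the limit can be pulled out of the docker stats row instead of read by eye. This is a sketch that assumes the default docker stats column layout shown in the sample above (container ID, name, CPU %, memory usage / limit). mariadb_mem_limit is an illustrative name.

```shell
# Print "<container-name> <memory-limit>" for each MariaDB container.
# Reads default `docker stats --no-stream` rows on stdin, where the memory
# column looks like "320.9MiB / 8GiB" (usage, slash, limit).
mariadb_mem_limit() {
  awk '$2 ~ /mariadb/ { print $2, $6 }'
}

# Example: docker stats --no-stream | mariadb_mem_limit
```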

  • Create Single Column Index in FSM Transition Table

Get the current number of rows and create a single column index in the FSM transition table.

Log in to the running MariaDB instance from the CLI VM using the command below.

mysql -u rdaf -prdaf123! -h <virtualip> -P 3307
rdauser@infra108122:~$ mysql -u rdaf -prdaf123! -h <virtualip> -P 3307
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 49622
Server version: 11.4.5-MariaDB-log Source distribution

Once logged in, verify the available databases

mysql> show databases;
+-----------------------------------------------------------------+
| Database                                                        |
+-----------------------------------------------------------------+
| 7280ba4c39af4e068598875c5f01fbe3_alert_processor                |
| 7280ba4c39af4e068598875c5f01fbe3_cfx_app_controller             |
| 7280ba4c39af4e068598875c5f01fbe3_cfxdimensions_app_access_manag |
| 7280ba4c39af4e068598875c5f01fbe3_cfxdimensions_app_collaboratio |
| 7280ba4c39af4e068598875c5f01fbe3_cfxdimensions_app_file_browser |
| 7280ba4c39af4e068598875c5f01fbe3_cfxdimensions_app_irm_service  |
| 7280ba4c39af4e068598875c5f01fbe3_cfxdimensions_app_notification |
| 7280ba4c39af4e068598875c5f01fbe3_cfxdimensions_app_resource_man |
| 7280ba4c39af4e068598875c5f01fbe3_configuration_service          |
| 7280ba4c39af4e068598875c5f01fbe3_fsm                            |
| 7280ba4c39af4e068598875c5f01fbe3_identity                       |
| 7280ba4c39af4e068598875c5f01fbe3_ml_config                      |
| 7280ba4c39af4e068598875c5f01fbe3_rda                            |
| 7280ba4c39af4e068598875c5f01fbe3_services_state                 |
| 7280ba4c39af4e068598875c5f01fbe3_user_preferences               |
| information_schema                                              |
| mysql                                                           |
| performance_schema                                              |
| saasportal                                                      |
| sys                                                             |
| test                                                            |
+-----------------------------------------------------------------+
21 rows in set (0.00 sec)

Switch to the FSM database

mysql> use 7280ba4c39af4e068598875c5f01fbe3_fsm;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed

Verify the tables available in the FSM database

mysql> show tables;
+------------------------------------------------+
| Tables_in_7280ba4c39af4e068598875c5f01fbe3_fsm |
+------------------------------------------------+
| timer                                          |
| transition                                     |
+------------------------------------------------+
2 rows in set (0.00 sec)

Check the current number of rows in the transition table using the command given below

mysql> select count(*) from transition;
+------------+
| count(*)   |
+------------+
|  2000      |
+------------+
1 row in set (0.08 sec)

Note

If the number of rows is equal to or greater than 100,000 (1 lakh), please contact the Fabrix.ai support team to create indexes on the above table.
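
The threshold decision in the note can be expressed as a small shell check. This is only a sketch: transition_needs_support is a hypothetical helper, and the row count is expected to be passed in as a plain integer.

```shell
# Succeed when the transition table row count is at or above the
# 100,000-row threshold at which the support team should create the indexes.
# $1: row count as an integer
transition_needs_support() {
  local rows="$1" threshold=100000
  [ "$rows" -ge "$threshold" ]
}

# Example:
#   if transition_needs_support 2000; then
#     echo "contact support before creating indexes"
#   else
#     echo "row count is below the threshold"
#   fi
```

The count itself can be read non-interactively with the standard mysql client options -N (no column names), -B (batch output) and -e (execute a statement), using the same connection parameters as the login command above.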

If the row count is below 100,000, create the required index on the transition table using the query below.

mysql> CREATE INDEX idx_uuid_solo ON transition(uuid);
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

Verify that the index has been created successfully using the command below. The idx_uuid_solo index should appear in the output.

mysql> SHOW INDEX FROM transition;
+------------+------------+---------------------------------------------------+--------------+---------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| Table      | Non_unique | Key_name                                          | Seq_in_index | Column_name               | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
+------------+------------+---------------------------------------------------+--------------+---------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| transition |          0 | PRIMARY                                           |            1 | uuid                      | A         |       85722 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| transition |          0 | PRIMARY                                           |            2 | current_state             | A         |      171445 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| transition |          1 | idx_uuid_is_final_state_last_transition_timestamp |            1 | uuid                      | A         |      171445 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| transition |          1 | idx_uuid_is_final_state_last_transition_timestamp |            2 | is_final_state            | A         |      171445 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| transition |          1 | idx_uuid_is_final_state_last_transition_timestamp |            3 | last_transition_timestamp | A         |      171445 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| transition |          1 | idx_finalstate_ts_uuid                            |            1 | is_final_state            | A         |           1 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| transition |          1 | idx_finalstate_ts_uuid                            |            2 | last_transition_timestamp | A         |       85722 |     NULL | NULL   | YES  | BTREE      |         |               | NO      |
| transition |          1 | idx_finalstate_ts_uuid                            |            3 | uuid                      | A         |      171445 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| transition |          1 | idx_uuid_solo                                     |            1 | uuid                      | A         |      171445 |     NULL | NULL   |      | BTREE      |         |               | NO      |
+------------+------------+---------------------------------------------------+--------------+---------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
9 rows in set (0.00 sec)

Once the indexes are verified, exit the MariaDB session

mysql> exit
Bye

1.2. Upgrade Steps

1.2.1 Download the new Docker Images

Log in to the VM where the rdaf deployment CLI was installed for the on-premise docker registry and for managing the Non-Kubernetes deployment.

Download the new docker image tags for RDAF Platform and OIA (AIOps) Application services and wait until all of the images are downloaded.

Run the command below to fetch the 8.1.0.6 images into the on-premise registry.

rdaf registry fetch --tag 8.1.0.6

Note

If the image download fails, re-run the above command.

Run the below command to verify above mentioned tags are downloaded for all of the RDAF Platform and OIA (AIOps) Application services.

rdaf registry list-tags 

Please make sure 8.1.0.6 image tag is downloaded for the below RDAF Platform services.

  • rda-api-server
  • rda-scheduler
  • rda-collector
  • rda-fsm
  • cfx-rda-access-manager
  • portal-backend
  • portal-frontend
  • rda-worker-all
  • rda-studio

Please make sure 8.1.0.6 image tag is downloaded for the below RDAF OIA (AIOps) Application services.

  • cfx-rda-app-controller
  • cfx-rda-alert-processor
  • cfx-rda-irm-service

Downloaded Docker images are stored under the below path.

/opt/rdaf-registry/data/docker/registry/v2/ or /opt/rdaf/data/docker/registry/v2/

Run the below command to check the filesystem's disk usage on the offline registry VM where the docker images are pulled.

df -h /opt

If necessary, older image tags that are no longer in use can be deleted to free up disk space using the command below.

Note

Run the command below if /opt occupies more than 80% of the disk space or if the free capacity of /opt is less than 25GB.

rdaf registry delete-images --tag <tag1,tag2>
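
The cleanup condition in the note combines two limits, which is easy to misread as requiring both. The sketch below makes the "either one" logic explicit. opt_needs_cleanup is a hypothetical helper, and both arguments are assumed to be integers read from the df -h /opt output.

```shell
# Succeed when cleaning up old image tags is advised for /opt:
# usage above 80% OR less than 25 GB of free capacity.
# $1: used percentage as an integer (Use% column without the % sign)
# $2: available space in GB as an integer (Avail column)
opt_needs_cleanup() {
  local use_pct="$1" avail_gb="$2"
  [ "$use_pct" -gt 80 ] || [ "$avail_gb" -lt 25 ]
}

# Example:
#   opt_needs_cleanup 85 40 && echo "free up space under /opt"
```

For the sample /opt line shown earlier (7% used, 47G available) the check does not trigger.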

1.2.2 Upgrade RDAF Platform Services

Warning

For Non-Kubernetes deployments, upgrading RDAF Platform and AIOps application services is a disruptive operation when the rolling-upgrade option is not used. Please schedule a maintenance window before upgrading RDAF Platform and AIOps services to a newer version.

Run the below command to initiate upgrading RDAF Platform services with zero downtime

rdaf platform upgrade --tag 8.1.0.6 --service rda_api_server --service rda_scheduler --service rda_collector --service rda_fsm --service cfx-rda-access-manager --service portal-backend --service portal-frontend --rolling-upgrade --timeout 10

Note

The value 10 passed to --timeout in the above command is in seconds.

Note

The rolling-upgrade option upgrades the Platform services running in high-availability mode on one VM at a time in sequence. It completes the upgrade of Platform services running on VM-1 before upgrading them on VM-2, followed by VM-3, and so on.

During this upgrade sequence, RDAF platform continues to function without any impact to the application traffic.

After completing the Platform services upgrade on all VMs, the CLI asks for user confirmation to delete the older version Platform service PODs. Enter YES to delete the old docker containers (in Non-Kubernetes deployments).

192.168.108.122:5000/ubuntu-rda-client-api-server:8.1.0.6

2025-10-30 10:32:43,693 [rdaf.component.platform] INFO     - Gathering platform container details.
2025-10-30 10:32:44,246 [rdaf.component.platform] INFO     - Gathering rdac pod details.
+----------+------------+---------+---------+--------------+-------------+------------+
| Pod ID   | Pod Type   | Version | Age     | Hostname     | Maintenance | Pod Status |
+----------+------------+---------+---------+--------------+-------------+------------+
| 7a4a239a | collector  | 8.1.0.x | 4:02:59 | a46780fed2d5 | None        | True       |
| fc561b34 | asm        | 8.1.0.x | 4:01:55 | bf6e4f48aa1e | None        | True       |
| 16bfb91a | api-server | 8.1.0.x | 4:04:33 | 51faaac76f07 | None        | True       |
+----------+------------+---------+---------+--------------+-------------+------------+
Continue moving above pods to maintenance mode? [yes/no]: yes
2025-10-30 10:33:00,533 [rdaf.component.platform] INFO     - Initiating Maintenance Mode...
2025-10-30 10:33:25,621 [rdaf.component.platform] INFO     - Following container are in maintenance mode
+----------+------------+---------+---------+--------------+-------------+------------+
| Pod ID   | Pod Type   | Version | Age     | Hostname     | Maintenance | Pod Status |
+----------+------------+---------+---------+--------------+-------------+------------+
| 16bfb91a | api-server | 8.1.0.x | 4:05:08 | 51faaac76f07 | maintenance | False      |
| fc561b34 | asm        | 8.1.0.x | 4:02:30 | bf6e4f48aa1e | maintenance | False      |
| 7a4a239a | collector  | 8.1.0.x | 4:03:34 | a46780fed2d5 | maintenance | False      |
+----------+------------+---------+---------+--------------+-------------+------------+
2025-10-30 10:33:25,622 [rdaf.component.platform] INFO     - Waiting for timeout of 2 seconds...
2025-10-30 10:33:27,622 [rdaf.component.platform] INFO     - Upgrading service: rda_collector on host 192.168.

Run the below command to initiate upgrading RDAF Platform services without the rolling-upgrade option (with downtime)

rdaf platform upgrade --tag 8.1.0.6

Please wait until all of the new platform services are in the Up state, then run the command below to verify their status and make sure the upgraded services are running the 8.1.0.6 version.

rdaf platform status
+--------------------------+----------------+-------------------------------+--------------+---------+
| Name                     | Host           | Status                        | Container Id | Tag     |
+--------------------------+----------------+-------------------------------+--------------+---------+
| rda_api_server           | 192.168.108.51 | Up 4 hours                    | dc2dd806e6a6 | 8.1.0.6 |
| rda_api_server           | 192.168.108.52 | Up 4 hours                    | a76257df0330 | 8.1.0.6 |
| rda_registry             | 192.168.108.51 | Up 4 hours                    | f23455c6b85b | 8.1.0.1 |
| rda_registry             | 192.168.108.52 | Up 4 hours                    | 3b8deb15ad1f | 8.1.0.1 |
| rda_scheduler            | 192.168.108.51 | Up 4 hours                    | 1864f7e88bfb | 8.1.0.6 |
| rda_scheduler            | 192.168.108.52 | Up 4 hours                    | 62089081e902 | 8.1.0.6 |
| rda_collector            | 192.168.108.51 | Up 4 hours                    | 50c81f436fd9 | 8.1.0.6 |
| rda_collector            | 192.168.108.52 | Up 4 hours                    | 754db49f2804 | 8.1.0.6 |
| rda_identity             | 192.168.108.51 | Up 4 hours                    | 37625fde83e8 | 8.1.0.1 |
| rda_identity             | 192.168.108.52 | Up 4 hours                    | bb60423a47fa | 8.1.0.1 |
| rda_asm                  | 192.168.108.51 | Up 4 hours                    | 5ae15e7d661e | 8.1.0.1 |
| rda_asm                  | 192.168.108.52 | Up 4 hours                    | 80181bb0f80e | 8.1.0.1 |
| rda_fsm                  | 192.168.108.51 | Up 4 hours                    | bfaf7206eacb | 8.1.0.6 |
| rda_fsm                  | 192.168.108.52 | Up 4 hours                    | 8c470b9d7b08 | 8.1.0.6 |
+--------------------------+----------------+-------------------------------+--------------+---------+
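
Since only selected services move to 8.1.0.6 (rda_registry, rda_identity and rda_asm stay on their earlier tag in the sample above), a tag check has to be limited to the upgraded services. A sketch under the assumption that the status table keeps the five-column layout shown above. services_not_at_tag is an illustrative name, not an rdaf subcommand.

```shell
# Print "<service> <host> <tag>" for listed services whose Tag column
# differs from the target tag.
# $1: target tag; remaining arguments: service names to check
# stdin: table printed by `rdaf platform status`
services_not_at_tag() {
  local tag="$1"
  shift
  local wanted=" $* "
  awk -F'|' -v tag="$tag" -v wanted="$wanted" '
    NF < 6 { next }                                      # skip border lines
    {
      for (i = 2; i <= 6; i++) gsub(/^ +| +$/, "", $i)   # trim cell padding
      if (index(wanted, " " $2 " ") && $6 != tag) print $2, $3, $6
    }'
}

# Example:
#   rdaf platform status | services_not_at_tag 8.1.0.6 \
#     rda_api_server rda_scheduler rda_collector rda_fsm
```

An empty result means every listed service reports the target tag on every host.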

Run the command below and check under the Site column that the rda-scheduler service is elected as leader.

rdac pods

Run the below command to check that all services have ok status and do not report any failure messages.

rdac healthcheck
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
| Cat       | Pod-Type                               | Host         | ID       | Site        | Health Parameter                                    | Status   | Message                                                     |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------|
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | service-status                                      | ok       |                                                             |
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | minio-connectivity                                  | ok       |                                                             |
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | service-dependency:configuration-service            | ok       | 2 pod(s) found for configuration-service                    |
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | service-initialization-status                       | ok       |                                                             |
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | kafka-connectivity                                  | ok       | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=1, Brokers=[1, 2, 3] |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | service-status                                      | ok       |                                                             |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | minio-connectivity                                  | ok       |                                                             |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | service-dependency:configuration-service            | ok       | 2 pod(s) found for configuration-service                    |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | service-initialization-status                       | ok       |                                                             |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | kafka-connectivity                                  | ok       | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=3, Brokers=[1, 2, 3] |
| rda_app   | alert-processor                        | c6cc7b04ab33 | b4ebfb06 |             | service-status                                      | ok       |                                                             |
| rda_app   | alert-processor                        | c6cc7b04ab33 | b4ebfb06 |             | minio-connectivity                                  | ok       |                                                             |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
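
The healthcheck table can run to hundreds of rows, so filtering for rows that are not ok is often quicker than scanning it. The sketch below assumes the eight-column layout shown above, with Status as the seventh column. health_failures is a hypothetical helper name.

```shell
# Print "<pod-type> <host> <health-parameter> <status>" for every data row
# whose Status column is anything other than "ok".
# stdin: table printed by `rdac healthcheck`
health_failures() {
  awk -F'|' '
    NF < 9 { next }                                      # borders and separator rows
    {
      for (i = 2; i <= 8; i++) gsub(/^ +| +$/, "", $i)   # trim cell padding
      if ($2 == "Cat") next                              # header row
      if ($8 != "ok") print $3, $4, $7, $8
    }'
}

# Example: rdac healthcheck | health_failures
```

No output means every listed health parameter reported ok.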

1.2.3 Upgrade RDA Worker Services

Note

If the worker was deployed in an HTTP proxy environment, please make sure the required HTTP proxy environment variables are added in the /opt/rdaf/deployment-scripts/values.yaml file under the rda_worker configuration section, as shown below, before upgrading RDA Worker services.

rda_worker:
  mem_limit: 8G
  memswap_limit: 8G
  privileged: false
  environment:
    RDA_ENABLE_TRACES: 'no'
    RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
    http_proxy:  "http://test:1234@192.168.122.107:3128"
    https_proxy: "http://test:1234@192.168.122.107:3128"
    HTTP_PROXY:  "http://test:1234@192.168.122.107:3128"
    HTTPS_PROXY: "http://test:1234@192.168.122.107:3128"
  • Upgrade RDA Worker Services

Please run the below command to initiate upgrading the RDA Worker Service with zero downtime

rdaf worker upgrade --tag 8.1.0.6 --rolling-upgrade --timeout 10

Note

If the worker is deployed in a proxy environment, add the required proxy environment variables in /opt/rdaf/deployment-scripts/values.yaml under the section rda_worker -> environment:, instead of making changes to worker.yaml (recommended only if new changes are needed for the worker).

Note

The value 10 passed to --timeout in the above command is in seconds.

Note

The rolling-upgrade option upgrades the Worker services running in high-availability mode on one VM at a time in sequence. It completes the upgrade of Worker services running on VM-1 before upgrading them on VM-2, followed by VM-3, and so on.

After completing the Worker services upgrade on all VMs, the CLI asks for user confirmation. Enter YES to delete the older version Worker service PODs.

Digest: sha256:728962901928f166dfc1a3d5d7ad931c133621d1abac598e140af3249905836c
Status: Downloaded newer image for 192.168.108.122:5000/ubuntu-rda-worker-all:8.1.0.6
192.168.108.122:5000/ubuntu-rda-worker-all:8.1.0.6

2025-10-30 10:52:34,199 [rdaf.component.worker] INFO     - Collecting worker details for rolling upgrade
2025-10-30 10:52:45,508 [rdaf.component.worker] INFO     - Rolling upgrade worker on 192.168.108.127
+----------+----------+---------+---------+--------------+-------------+------------+
| Pod ID   | Pod Type | Version | Age     | Hostname     | Maintenance | Pod Status |
+----------+----------+---------+---------+--------------+-------------+------------+
| c2544331 | worker   | 8.1.0.3 | 4:10:38 | 5fac8ebe85fe | None        | True       |
+----------+----------+---------+---------+--------------+-------------+------------+
Continue moving above pod to maintenance mode? [yes/no]: yes
2025-10-30 10:54:30,732 [rdaf.component.worker] INFO     - Initiating maintenance mode for pod c2544331
2025-10-30 10:54:52,747 [rdaf.component.worker] INFO     - Following worker container is in maintenance mode
+----------+----------+---------+---------+--------------+-------------+------------+
| Pod ID   | Pod Type | Version | Age     | Hostname     | Maintenance | Pod Status |
+----------+----------+---------+---------+--------------+-------------+------------+
| c2544331 | worker   | 8.1.0.3 | 4:12:43 | 5fac8ebe85fe | maintenance | False      |

Please run the below command to initiate upgrading the RDA Worker Service without the rolling-upgrade option (with downtime)

rdaf worker upgrade --tag 8.1.0.6

Please wait for 120 seconds to let the newer version of RDA Worker service containers join the RDA Fabric appropriately. Run the below commands to verify the status of the newer RDA Worker service containers.

rdac pods | grep worker
| Infra | worker      | True        | 6eff605e72c4 | a318f394 | rda-site-01 | 13:45:13 |      4 |        31.21 | 0             | 0            |
| Infra | worker      | True        | ae7244d0d10a | 554c2cd8 | rda-site-01 | 13:40:40 |      4 |        31.21 | 0             | 0            |
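
To confirm that the expected number of workers has joined, the rows above can be counted rather than inspected one by one. This sketch assumes the third column of the rdac pods output (True in the sample) indicates that the pod is ready. The column header is not shown in this section, so treat that reading as an assumption. ready_worker_count is an illustrative name.

```shell
# Count rows of `rdac pods` output whose pod type is "worker" and whose
# third column is "True".
# stdin: output of `rdac pods`
ready_worker_count() {
  awk -F'|' '
    NF >= 4 {
      gsub(/ /, "", $3)     # pod type cell
      gsub(/ /, "", $4)     # assumed readiness cell
      if ($3 == "worker" && $4 == "True") n++
    }
    END { print n + 0 }'
}

# Example: rdac pods | ready_worker_count
```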

rdaf worker status

+------------+----------------+------------+--------------+---------+
| Name       | Host           | Status     | Container Id | Tag     |
+------------+----------------+------------+--------------+---------+
| rda_worker | 192.168.108.53 | Up 4 hours | ea187f89505f | 8.1.0.6 |
| rda_worker | 192.168.108.54 | Up 4 hours | a62b3230bbaa | 8.1.0.6 |
+------------+----------------+------------+--------------+---------+
Run the below command to check that all RDA Worker services have ok status and do not report any failure messages.

rdac healthcheck
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+
| Cat       | Pod-Type                               | Host         | ID       | Site        | Health Parameter                                    | Status   | Message                                                                                                                     |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------|
| rda_infra | api-server                             | 1b0542719618 | 1845ae67 |             | service-status                                      | ok       |                                                                                                                             |
| rda_infra | api-server                             | 1b0542719618 | 1845ae67 |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_infra | api-server                             | d4404cffdc7a | a4cfdc6d |             | service-status                                      | ok       |                                                                                                                             |
| rda_infra | api-server                             | d4404cffdc7a | a4cfdc6d |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_infra | asm                                    | 8d3d52a7a475 | 418c9dc1 |             | service-status                                      | ok       |                                                                                                                             |
| rda_infra | asm                                    | 8d3d52a7a475 | 418c9dc1 |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_infra | asm                                    | ab172a9b8229 | 2ac1d67a |             | service-status                                      | ok       |                                                                                                                             |
| rda_infra | asm                                    | ab172a9b8229 | 2ac1d67a |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_app   | asset-dependency                       | 6ac69ca1085c | c2e9dcb9 |             | service-status                                      | ok       |                                                                                                                             |
| rda_app   | asset-dependency                       | 6ac69ca1085c | c2e9dcb9 |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_app   | asset-dependency                       | 58a5f4f460d3 | 0b91caac |             | service-status                                      | ok       |                                                                                                                             |
| rda_app   | asset-dependency                       | 58a5f4f460d3 | 0b91caac |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_app   | authenticator                          | 9011c2aef498 | 9f7efdc3 |             | service-status                                      | ok       |                                                                                                                             |
| rda_app   | authenticator                          | 9011c2aef498 | 9f7efdc3 |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_app   | authenticator                          | 9011c2aef498 | 9f7efdc3 |             | DB-connectivity                                     | ok       |                                                                                                                             |
| rda_app   | authenticator                          | 148621ed8c82 | dbf16b82 |             | service-status                                      | ok       |                                                                                                                             |
| rda_app   | authenticator                          | 148621ed8c82 | dbf16b82 |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_app   | authenticator                          | 148621ed8c82 | dbf16b82 |             | DB-connectivity                                     | ok       |                                                                                                                             |
| rda_app   | cfx-app-controller                     | 75ec0f30cfa3 | 1198fdee |             | service-status                                      | ok       |                                                                                                                             |
| rda_app   | cfx-app-controller                     | 75ec0f30cfa3 | 1198fdee |             | minio-connectivity                                  | ok       |                                                                                                                             |
| rda_app   | cfx-app-controller                     | 75ec0f30cfa3 | 1198fdee |             | service-initialization-status                       | ok       |                                                                                                                             |
| rda_app   | cfx-app-controller                     | 75ec0f30cfa3 | 1198fdee |             | DB-connectivity                                     | ok       |                                                                                                                             |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+

1.2.4 Upgrade OIA Application Services

Run the below command to upgrade the selected RDA Fabric OIA Application services with zero downtime.

rdaf app upgrade OIA --tag 8.1.0.6 --service cfx-rda-app-controller --service cfx-rda-alert-processor --service cfx-rda-alert-correlator --service cfx-rda-irm-service --rolling-upgrade --timeout 10

Note

The --timeout value (10) in the above command is specified in seconds.

Note

The rolling-upgrade option upgrades the OIA application services running in high-availability mode on one VM at a time in sequence. It completes the upgrade of OIA application services running on VM-1 before upgrading them on VM-2, followed by VM-3, and so on.

After completing the OIA application services upgrade on all VMs, it will ask for user confirmation to delete the older version OIA application service PODs.

2025-10-30 11:23:36,923 [rdaf.component.oia] INFO     - Gathering OIA app container details.
2025-10-30 11:23:38,026 [rdaf.component.oia] INFO     - Gathering rdac pod details.
+----------+-----------------------+---------+---------+--------------+-------------+------------+
| Pod ID   | Pod Type              | Version | Age     | Hostname     | Maintenance | Pod Status |
+----------+-----------------------+---------+---------+--------------+-------------+------------+
| 1c14f6eb | alert-ingester        | 8.1.0.x | 4:29:25 | c236e62139f8 | None        | True       |
| 66ed0060 | alert-processor       | 8.1.0.x | 4:27:14 | c6672a224c54 | None        | True       |
| b95a768f | event-consumer        | 8.1.0.x | 4:27:48 | 3f4516b3e057 | None        | True       |
| acecb0b7 | alert-processor-      | 8.1.0.x | 4:23:44 | 02cf84e94ea7 | None        | True       |
|          | companion             |         |         |              |             |            |
| 75c30a77 | alert-correlator      | 8.1.0.x | 4:26:41 | 895c6b108728 | None        | True       |
| 73cc3ae8 | cfxdimensions-app-    | 8.1.0.x | 4:24:55 | b8d988286bf3 | None        | True       |
|          | collaboration         |         |         |              |             |            |
+----------+-----------------------+---------+---------+--------------+-------------+------------+
Continue moving above pods to maintenance mode? [yes/no]: yes
2025-10-30 11:23:55,371 [rdaf.component.oia] INFO     - Initiating Maintenance Mode...
2025-10-30 11:24:18,171 [rdaf.component.oia] INFO     - Following container are in maintenance mode
+----------+-----------------------+---------+---------+--------------+-------------+------------+
| Pod ID   | Pod Type              | Version | Age     | Hostname     | Maintenance | Pod Status |
+----------+-----------------------+---------+---------+--------------+-------------+------------+
| 75c30a77 | alert-correlator      | 8.1.0.x | 4:27:06 | 895c6b108728 | maintenance | False      |
| 1c14f6eb | alert-ingester        | 8.1.0.x | 4:29:50 | c236e62139f8 | maintenance | False      |
| 66ed0060 | alert-processor       | 8.1.0.x | 4:27:39 | c6672a224c54 | maintenance | False      |
| acecb0b7 | alert-processor-      | 8.1.0.x | 4:24:09 | 02cf84e94ea7 | maintenance | False      |
|          | companion             |         |         |              |             |            |
| 73cc3ae8 | cfxdimensions-app-    | 8.1.0.x | 4:25:21 | b8d988286bf3 | maintenance | False      |
|          | collaboration         |         |         |              |             |            |
| b95a768f | event-consumer        | 8.1.0.x | 4:28:13 | 3f4516b3e057 | maintenance | False      |
+----------+-----------------------+---------+---------+--------------+-------------+------------+
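When the list of services to upgrade changes, it can help to assemble the zero-downtime command programmatically instead of editing it by hand. The following is a minimal Python sketch that rebuilds the rolling-upgrade command shown earlier in this section. It uses only the flags that appear in that command; the helper name build_rolling_upgrade_cmd is hypothetical and is not part of the rdaf CLI.

```python
import shlex


def build_rolling_upgrade_cmd(tag, services, timeout=10):
    """Return the argument list for a rolling `rdaf app upgrade OIA` run.

    Only flags that appear in the documented command are used:
    --tag, --service (repeated), --rolling-upgrade and --timeout (seconds).
    """
    cmd = ["rdaf", "app", "upgrade", "OIA", "--tag", tag]
    for service in services:
        cmd += ["--service", service]
    cmd += ["--rolling-upgrade", "--timeout", str(timeout)]
    return cmd


# Services taken from the command in this section.
services = [
    "cfx-rda-app-controller",
    "cfx-rda-alert-processor",
    "cfx-rda-alert-correlator",
    "cfx-rda-irm-service",
]
cmd = build_rolling_upgrade_cmd("8.1.0.6", services)

# Print a copy-pasteable command line for review before running it.
print(shlex.join(cmd))
```

Printing the command first, rather than executing it directly, leaves room to review it before the interactive maintenance-mode prompt appears.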

Run the below command to upgrade the RDA Fabric OIA Application services without the zero-downtime (rolling upgrade) option.

rdaf app upgrade OIA --tag 8.1.0.6

Wait until all of the new OIA application service containers are in the Up state, then run the below command to verify their status and confirm that they are running version 8.1.0.6.

rdaf app status
+--------------------------+-----------------+-----------+--------------+---------+
| Name                     | Host            | Status    | Container Id | Tag     |
+--------------------------+-----------------+-----------+--------------+---------+
| cfx-rda-app-controller   | 192.168.108.127 | Up 6 days | f6447547ee74 | 8.1.0.6 |
| cfx-rda-app-controller   | 192.168.108.128 | Up 6 days | a0b93ed591b4 | 8.1.0.6 |
| cfx-rda-alert-processor  | 192.168.108.127 | Up 6 days | 8a7ccbe6fce6 | 8.1.0.6 |
| cfx-rda-alert-processor  | 192.168.108.128 | Up 6 days | b1cbecda63cb | 8.1.0.6 |
| cfx-rda-alert-correlator | 192.168.108.127 | Up 6 days | 5d8316cf08b4 | 8.1.0.6 |
| cfx-rda-alert-correlator | 192.168.108.128 | Up 6 days | 863f40e26df1 | 8.1.0.6 |
| cfx-rda-irm-service      | 192.168.108.127 | Up 6 days | 164f38367d0e | 8.1.0.6 |
| cfx-rda-irm-service      | 192.168.108.128 | Up 6 days | 4ea62df19baf | 8.1.0.6 |
+--------------------------+-----------------+-----------+--------------+---------+
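On larger deployments the status table can be long, so a small script may be easier than checking every row by eye. The sketch below assumes the table layout shown above (Name, Host, Status, Container Id, Tag columns) and that the output of rdaf app status has been saved as text; the function names parse_table and pending_services are hypothetical.

```python
def parse_table(text):
    """Parse a '+---+' bordered CLI table into a list of dicts keyed by header."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Keep header and data rows only; skip borders such as +-----+-----+
        if not line.startswith("|") or set(line) <= set("|+-"):
            continue
        rows.append([cell.strip() for cell in line[1:-1].split("|")])
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


def pending_services(rows, expected_tag):
    """Return rows that are not in the Up state or not on the expected tag."""
    return [
        r for r in rows
        if not r["Status"].startswith("Up") or r["Tag"] != expected_tag
    ]


# Two rows copied from the sample output above.
sample = """\
+--------------------------+-----------------+-----------+--------------+---------+
| Name                     | Host            | Status    | Container Id | Tag     |
+--------------------------+-----------------+-----------+--------------+---------+
| cfx-rda-app-controller   | 192.168.108.127 | Up 6 days | f6447547ee74 | 8.1.0.6 |
| cfx-rda-irm-service      | 192.168.108.128 | Up 6 days | 4ea62df19baf | 8.1.0.6 |
+--------------------------+-----------------+-----------+--------------+---------+
"""
rows = parse_table(sample)
for row in pending_services(rows, "8.1.0.6"):
    print("not upgraded yet:", row["Name"], row["Host"], row["Status"], row["Tag"])
```

An empty result means every listed container is Up and running the expected tag.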

Run the below command to verify all OIA application services are up and running.

rdac pods
+-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------+
| Cat   | Pod-Type                               | Pod-Ready   | Host           | ID       | Site        | Age      |   CPUs |   Memory(GB) | Active Jobs   | Total Jobs   |
|-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------|
| App   | alert-ingester                         | True        | rda-alert-inge | 6a6e464d |             | 19:22:36 |      8 |        31.33 |               |              |
| App   | alert-ingester                         | True        | rda-alert-inge | 7f6b42a0 |             | 19:22:53 |      8 |        31.33 |               |              |
| App   | alert-processor                        | True        | rda-alert-proc | a880e491 |             | 19:23:21 |      8 |        31.33 |               |              |
| App   | alert-processor                        | True        | rda-alert-proc | b684609e |             | 19:23:18 |      8 |        31.33 |               |              |
| App   | alert-processor-companion              | True        | rda-alert-proc | 874f3b33 |             | 19:22:24 |      8 |        31.33 |               |              |
| App   | alert-processor-companion              | True        | rda-alert-proc | 70cadaa7 |             | 19:22:05 |      8 |        31.33 |               |              |
| App   | asset-dependency                       | True        | rda-asset-depe | bde06c15 |             | 19:47:50 |      8 |        31.33 |               |              |
| App   | asset-dependency                       | True        | rda-asset-depe | 47b9eb02 |             | 19:47:38 |      8 |        31.33 |               |              |
| App   | authenticator                          | True        | rda-identity-d | faa33e1b |             | 19:47:52 |      8 |        31.33 |               |              |
| App   | authenticator                          | True        | rda-identity-d | 36083c36 |             | 19:47:46 |      8 |        31.33 |               |              |
| App   | cfx-app-controller                     | True        | rda-app-contro | 5fd3c3f4 |             | 19:23:09 |      8 |        31.33 |               |              |
| App   | cfx-app-controller                     | True        | rda-app-contro | d66e5ce8 |             | 19:22:56 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-access-manager       | True        | rda-access-man | ecbb535c |             | 19:47:46 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-access-manager       | True        | rda-access-man | 9a05db5a |             | 19:47:36 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-collaboration        | True        | rda-collaborat | 61b3c53b |             | 19:22:18 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-collaboration        | True        | rda-collaborat | 09b9474e |             | 19:21:57 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-file-browser         | True        | rda-file-brows | 00495640 |             | 19:22:45 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-file-browser         | True        | rda-file-brows | 640f0653 |             | 19:22:29 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-irm_service          | True        | rda-irm-servic | 27e345c5 |             | 19:21:43 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-irm_service          | True        | rda-irm-servic | 23c7e082 |             | 19:21:56 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-notification-service | True        | rda-notificati | bbb5b08b |             | 19:23:20 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-notification-service | True        | rda-notificati | 9841bcb5 |             | 19:23:02 |      8 |        31.33 |               |              |
+-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------+

Run the below command to check that all services have ok status and do not report any failure messages.

rdac healthcheck
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
| Cat       | Pod-Type                               | Host         | ID       | Site        | Health Parameter                                    | Status   | Message                                                     |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------|
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | service-status                                      | ok       |                                                             |
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | minio-connectivity                                  | ok       |                                                             |
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | service-dependency:configuration-service            | ok       | 2 pod(s) found for configuration-service                    |
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | service-initialization-status                       | ok       |                                                             |
| rda_app   | alert-ingester                         | 7f75047e9e44 | daa8c414 |             | kafka-connectivity                                  | ok       | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=1, Brokers=[1, 2, 3] |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | service-status                                      | ok       |                                                             |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | minio-connectivity                                  | ok       |                                                             |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | service-dependency:configuration-service            | ok       | 2 pod(s) found for configuration-service                    |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | service-initialization-status                       | ok       |                                                             |
| rda_app   | alert-ingester                         | f9ec55862be0 | f9b9231c |             | kafka-connectivity                                  | ok       | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=2, Brokers=[1, 2, 3] |
| rda_app   | alert-processor                        | c6cc7b04ab33 | b4ebfb06 |             | service-status                                      | ok       |                                                             |
| rda_app   | alert-processor                        | c6cc7b04ab33 | b4ebfb06 |             | minio-connectivity                                  | ok       |                                                             |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
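Because the health check prints one row per pod and health parameter, failures are easy to miss in a long listing. As a rough sketch, assuming the column names shown above and that the rdac healthcheck output has been captured as text, the following Python function keeps only the rows whose Status is not ok. The function name failing_checks is hypothetical, and the sample rows are copied from the output above with the column padding shortened.

```python
def failing_checks(table_text):
    """Return (pod_type, pod_id, parameter, status, message) for non-ok rows."""
    header = None
    failures = []
    for line in table_text.splitlines():
        line = line.strip()
        # Skip borders (+---+) and the |---+---| separator under the header.
        if not line.startswith("|") or set(line) <= set("|+-"):
            continue
        cells = [c.strip() for c in line[1:-1].split("|")]
        if header is None:
            header = cells
            continue
        row = dict(zip(header, cells))
        if row["Status"].lower() != "ok":
            failures.append((
                row["Pod-Type"], row["ID"], row["Health Parameter"],
                row["Status"], row["Message"],
            ))
    return failures


sample = """\
| Cat     | Pod-Type       | Host         | ID       | Site | Health Parameter   | Status | Message |
|---------+----------------+--------------+----------+------+--------------------+--------+---------|
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 |      | service-status     | ok     |         |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 |      | kafka-connectivity | ok     | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=1, Brokers=[1, 2, 3] |
"""
failures = failing_checks(sample)
print(failures if failures else "all health parameters are ok")
```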

1.2.5 Upgrade Script

  • Download the Python script rdaf_upgrade_1411_1412.py using the below command.
wget https://macaw-amer.s3.us-east-1.amazonaws.com/releases/rdaf-platform/1.4.1.2/rdaf_upgrade_1411_1412.py
  • Execute the following upgrade script on the CLI VM.
python rdaf_upgrade_1411_1412.py upgrade

When run, the upgrade script automatically backs up the existing my_custom.cnf and haproxy.cfg files on every VM where they are deployed and generates new versions of both files with the required changes. The updated configurations produced by the script are documented below.

haproxy.cfg & my_custom.cnf

Note

The updated haproxy.cfg and my_custom.cnf configurations are shown below.

haproxy.cfg
global
    nbthread 8
    cpu-map auto:1-8 0-7
    maxconn 20000
    log 127.0.0.1 local2
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    external-check
    insecure-fork-wanted
    ssl-default-bind-options no-sslv3 no-tls-tickets force-tlsv12
    ssl-default-bind-ciphers EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH
    ssl-server-verify none
    tune.ssl.default-dh-param 2048
    tune.ssl.cachesize 100000
    tune.ssl.lifetime 600
    tune.ssl.maxrecord 16384
    tune.http.maxhdr 1024
    max-spread-checks 5s
    spread-checks 10

defaults
    log global
    mode http
    retries 3
    maxconn 20000
    timeout connect         5s
    timeout client         900s
    timeout server         900s
    timeout http-request    60s
    timeout http-keep-alive 60s
    timeout queue           30s
    option httplog
    option log-separate-errors
    option log-health-checks
    option redispatch
    option http-keep-alive
    option forwardfor
    option tcp-smart-accept
    option tcp-smart-connect

frontend stats
    mode http
    bind *:7222
    stats enable
    stats uri /stats
    stats refresh 60s

frontend minio
    mode http
    bind *:9443 ssl crt /opt/certificates/haproxy.pem alpn h2,http/1.1
    http-request set-header X-Forwarded-Port %[dst_port]
    http-request set-header X-Forwarded-Proto https if { ssl_fc }
    redirect scheme https if !{ ssl_fc }
    default_backend minio

frontend mariadb
    bind *:3307
    timeout client 660s
    mode tcp
    default_backend mariadb
    maxconn 2000

frontend portal
    bind *:80
    bind *:443 ssl crt /opt/certificates/haproxy.pem
    acl WEBHOOK_PATH path_beg -i /webhooks/
    use_backend webhook if WEBHOOK_PATH
    timeout client 30s
    mode http
    rate-limit sessions 250
    http-request set-header X-Forwarded-Port %[dst_port]
    http-request set-header X-Forwarded-Proto https if { ssl_fc }
    redirect scheme https unless { ssl_fc }
    default_backend portal

backend portal
    mode http
    balance roundrobin
    http-check send meth HEAD  uri /
    http-check expect rstatus (2|3)[0-9][0-9]
    http-check disable-on-404
    default-server inter 10s downinter 5s
    cookie rdafportal insert indirect nocache maxidle 30m maxlife 24h
    # Issue: https://github.com/cloudfabrix/rda/issues/2560
    # On a setup where ui prefix is configured but portal is accessed from internal directly,
    # Some icons may be addressed using aiops or aips-uat prefix.
    # Which will not work because they are not going through nginx proxy replacement
    # These requests directly come to haproxy instead of through nginx running on 8443
    http-request set-path "%[path,regsub(^/aiops/,/)]"
    http-request set-path "%[path,regsub(^aiops/,/)]"
    http-request set-path "%[path,regsub(^/aiops-uat/,/)]"
    http-request set-path "%[path,regsub(^aiops-uat/,/)]"
    server portal-192.168.108.125 192.168.108.125:7780 check cookie rdaf-portal-1
    server portal-192.168.108.126 192.168.108.126:7780 check cookie rdaf-portal-2

frontend rda-server
    bind *:8808
    mode http
    http-request set-header X-Forwarded-Port %[dst_port]
    default_backend rda-server

backend rda-server
    mode http
    balance roundrobin
    http-check send meth HEAD  uri /rdac
    http-check expect rstatus (2|3)[0-9][0-9]
    http-check disable-on-404
    default-server inter 10s downinter 5s
    server rda-api-server-192.168.108.125 192.168.108.125:8807
    server rda-api-server-192.168.108.126 192.168.108.126:8807

backend mariadb
    mode tcp
    balance roundrobin
    option tcpka
    timeout server 660s
    timeout connect 10s
    default-server inter 10s downinter 5s
    option external-check
    external-check command /maria_cluster_check
    server mariadb-192.168.108.122 192.168.108.122:3306 check backup
    server mariadb-192.168.108.123 192.168.108.123:3306 check
    server mariadb-192.168.108.124 192.168.108.124:3306 check backup

backend minio
    mode http
    balance roundrobin
    option forwardfor
    http-check send meth HEAD  uri /minio/health/live ver HTTP/1.1 hdr host localhost
    http-check expect rstatus (2|3)[0-9][0-9]
    http-check disable-on-404
    default-server inter 10s downinter 5s
    cookie mnserverid insert indirect nocache maxidle 30m maxlife 24h
    server minio-192.168.108.122 192.168.108.122:9000 check cookie rdaf-objstr-1
    server minio-192.168.108.123 192.168.108.123:9000 check cookie rdaf-objstr-2
    server minio-192.168.108.124 192.168.108.124:9000 check cookie rdaf-objstr-3
    server minio-192.168.108.125 192.168.108.125:9000 check cookie rdaf-objstr-4



backend webhook
    mode http
    balance roundrobin
    stick-table type ip size 10k expire 10m
    stick on src
    option httpchk GET /healthcheck
    http-check expect rstatus (2|3)[0-9][0-9]
    http-check disable-on-404
    http-response set-header Cache-Control no-store
    http-response set-header Pragma no-cache
    default-server inter 10s downinter 5s fall 3 rise 2
    cookie SERVERID insert indirect nocache maxidle 30m maxlife 24h httponly secure
    server rdaf-webhook-1 192.168.108.127:8888 check cookie rdaf-webhook-1
    server rdaf-webhook-2 192.168.108.128:8888 check cookie rdaf-webhook-2

frontend smtp
    bind 0.0.0.0:25
    mode tcp
    timeout client 1m
    log global
    option tcplog
    default_backend smtp

backend smtp
    mode tcp
    log global
    option tcplog
    timeout server 1m
    timeout connect 7s
    server rdaf-smtp-1 192.168.108.127:8456
    server rdaf-smtp-2 192.168.108.128:8456
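When reviewing a regenerated haproxy.cfg, a quick inventory of the servers in each backend can be useful. The following Python sketch is a simplified reader, not a full HAProxy parser: it assumes one directive per line, as in the file above, and reports for every backend which server lines carry the check keyword. The function name servers_by_backend is hypothetical, and the sample is an excerpt of two backends from the file above.

```python
SECTION_KEYWORDS = {"global", "defaults", "frontend", "backend", "listen"}


def servers_by_backend(cfg_text):
    """Map each backend name to a list of (server_name, address, has_check)."""
    backends = {}
    current = None
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        tokens = line.split()
        if tokens[0] in SECTION_KEYWORDS:
            # Only track 'backend <name>' sections; leave any other section.
            if tokens[0] == "backend" and len(tokens) > 1:
                current = tokens[1]
                backends[current] = []
            else:
                current = None
        elif tokens[0] == "server" and current is not None and len(tokens) >= 3:
            backends[current].append((tokens[1], tokens[2], "check" in tokens[3:]))
    return backends


sample = """\
backend rda-server
    mode http
    default-server inter 10s downinter 5s
    server rda-api-server-192.168.108.125 192.168.108.125:8807
    server rda-api-server-192.168.108.126 192.168.108.126:8807

backend mariadb
    mode tcp
    server mariadb-192.168.108.122 192.168.108.122:3306 check backup
    server mariadb-192.168.108.123 192.168.108.123:3306 check
    server mariadb-192.168.108.124 192.168.108.124:3306 check backup
"""
inventory = servers_by_backend(sample)
for backend, servers in inventory.items():
    for name, address, has_check in servers:
        print(f"{backend:12} {name:32} {address:22} check={has_check}")
```

HAProxy only health-checks a server when its line includes check, so the last column is a quick way to see which backends rely on health checks.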
my_custom.cnf
[mysqld]
transaction_isolation=READ-COMMITTED
binlog_format=ROW
#Logging
log_error                       = /opt/rdaf/log/mariadb.log
log_queries_not_using_indexes   = 1
long_query_time                 = 5
slow_query_log                  = 0     # Disabled for production
slow_query_log_file             = /opt/rdaf/log/mariadb-slow.log

#Log expiry
expire_logs_days=1

max_connections=256
max_connect_errors=1000000
connect_timeout=10
max_allowed_packet=128M
# The wait_timeout system variable sets the time in seconds that the
# server waits for an idle interactive connection to become active before closing it.
wait_timeout=720
interactive_timeout=720
net_read_timeout=300
net_write_timeout=300
idle_transaction_timeout=300


#InnoDB tables
innodb_buffer_pool_size=2G
#Log File should be .25 of Buffer pool Size.
innodb_log_file_size=1G
innodb_log_buffer_size=64M
innodb_file_per_table=1
innodb_flush_log_at_trx_commit=1
innodb_lock_wait_timeout=5
innodb_purge_threads=4
innodb_write_io_threads=8
innodb_read_io_threads=8
innodb_flush_method=O_DIRECT
innodb_max_dirty_pages_pct=10
innodb_max_dirty_pages_pct_lwm=5
innodb_io_capacity=600
innodb_print_all_deadlocks=ON

[galera]
wsrep_provider_options="evs.suspect_timeout=PT30S; evs.inactive_timeout=PT45S; evs.inactive_check_period=PT15S"
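Before tuning my_custom.cnf further, it can be handy to read the size settings back and compare them with the guideline in the file's own comment. The sketch below uses Python's standard configparser, which handles the simple key=value layout shown above but is not a complete MariaDB option-file parser (that is an assumption of this example). The helper names load_cnf and to_bytes are hypothetical, and the sample is a short excerpt of the file.

```python
import configparser

UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(value):
    """Convert a size such as '64M' or '2G' to a number of bytes."""
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in UNITS:
        return int(value[:-1]) * UNITS[suffix]
    return int(value)


def load_cnf(text):
    """Parse option-file text; '#' starts a comment, also after a value."""
    parser = configparser.ConfigParser(
        interpolation=None,            # values may legitimately contain '%'
        inline_comment_prefixes=("#",),
        allow_no_value=True,           # tolerate bare flags without '=value'
        strict=False,                  # tolerate repeated keys or sections
    )
    parser.read_string(text)
    return parser


sample = """\
[mysqld]
slow_query_log                  = 0     # Disabled for production
wait_timeout=720
innodb_buffer_pool_size=2G
#Log File should be .25 of Buffer pool Size.
innodb_log_file_size=1G
innodb_log_buffer_size=64M
"""
cnf = load_cnf(sample)
mysqld = cnf["mysqld"]
ratio = to_bytes(mysqld["innodb_log_file_size"]) / to_bytes(mysqld["innodb_buffer_pool_size"])
print(f"innodb_log_file_size / innodb_buffer_pool_size = {ratio:.2f}")
```

Compare the printed ratio with the guideline in the comment before changing either value, and confirm any tuning with the support team.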
rdauser@infra108122:~$ python rdaf_upgrade_1411_1412.py upgrade
Creating backup of haproxy.cfg on 192.168.108.122
Updating haproxy configuration on 192.168.108.122
HAProxy configuration updated on 192.168.108.122
Creating backup of haproxy.cfg on 192.168.108.123
Updating haproxy configuration on 192.168.108.123
HAProxy configuration updated on 192.168.108.123
Creating backup of my_custom.cnf on 192.168.108.122
Updating MariaDB configuration on 192.168.108.122
MariaDB configuration updated on 192.168.108.122
Creating backup of my_custom.cnf on 192.168.108.123
Updating MariaDB configuration on 192.168.108.123
MariaDB configuration updated on 192.168.108.123
Creating backup of my_custom.cnf on 192.168.108.124
Updating MariaDB configuration on 192.168.108.124
MariaDB configuration updated on 192.168.108.124
rdauser@infra108122:~$ 
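To confirm that the script processed every expected host, its console output can be summarized per component. The following is a minimal Python sketch that assumes the message wording shown in the sample output above ("Creating backup of ... on ..." and "... configuration updated on ..."); the wording may differ in other versions of the script, and the function names summarize and incomplete_hosts are hypothetical.

```python
import re
from collections import defaultdict

BACKUP = re.compile(r"^Creating backup of (\S+) on (\S+)$")
UPDATED = re.compile(r"^(HAProxy|MariaDB) configuration updated on (\S+)$")
FILE_TO_COMPONENT = {"haproxy.cfg": "HAProxy", "my_custom.cnf": "MariaDB"}


def summarize(output):
    """Return {component: {'backed_up': hosts, 'updated': hosts}} from the log."""
    summary = defaultdict(lambda: {"backed_up": set(), "updated": set()})
    for line in output.splitlines():
        line = line.strip()
        if m := BACKUP.match(line):
            component = FILE_TO_COMPONENT.get(m.group(1))
            if component:
                summary[component]["backed_up"].add(m.group(2))
        elif m := UPDATED.match(line):
            summary[m.group(1)]["updated"].add(m.group(2))
    return dict(summary)


def incomplete_hosts(summary):
    """Hosts that were backed up but never reported as updated."""
    return {
        component: sorted(hosts["backed_up"] - hosts["updated"])
        for component, hosts in summary.items()
        if hosts["backed_up"] - hosts["updated"]
    }


# Excerpt of the sample output above.
sample = """\
Creating backup of haproxy.cfg on 192.168.108.122
Updating haproxy configuration on 192.168.108.122
HAProxy configuration updated on 192.168.108.122
Creating backup of my_custom.cnf on 192.168.108.124
Updating MariaDB configuration on 192.168.108.124
MariaDB configuration updated on 192.168.108.124
"""
summary = summarize(sample)
print(incomplete_hosts(summary) or "every backed-up host was updated")
```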

Note

If your environment is running under high load, contact the Fabrix.ai support team to tune these parameters further for your workload. After receiving the tuned my_custom.cnf and haproxy.cfg files, follow the steps below.

Copy the new my_custom.cnf file on all MariaDB VMs

  • Log in to all the VMs where the MariaDB containers are running, one by one.

  • Navigate to the below directory.

    /opt/rdaf/config/mariadb/
    

  • Copy the new my_custom.cnf file into this directory.

Copy the new haproxy.cfg file on all HAProxy VMs

  • Log in to all the VMs where the HAProxy containers are running, one by one.

  • Navigate to the below directory.

    /opt/rdaf/config/haproxy/
    

  • Copy the new haproxy.cfg file into this directory.

Restart MariaDB Manually

  • MariaDB nodes must be restarted manually. Follow the steps below to restart the MariaDB cluster.

Step 1. Log in to the CLI VM and open the below file to get the order of the MariaDB nodes from the [mariadb] section.

vi /opt/rdaf/rdaf.cfg
[mariadb]
datadir = 192.168.108.50/var/mysql,192.168.108.56/var/mysql,192.168.108.58/var/mysql
user = xxxxxxxxxxxxxx
password = xxxxxxxxxxx
host = 192.168.108.50,192.168.108.56,192.168.108.58
master_id = 0
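The restart procedure in the following steps works through the nodes in reverse order, finishing with the first node. As an illustration, the Python sketch below reads the host entry from the [mariadb] section and prints that reverse sequence. It assumes the first host listed is the bootstrap node, as in Step 6; the function name mariadb_restart_order is hypothetical, and the credentials are masked exactly as in the sample above.

```python
import configparser


def mariadb_restart_order(cfg_text):
    """Return the MariaDB hosts from rdaf.cfg in reverse (restart) order."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    hosts = [h.strip() for h in parser["mariadb"]["host"].split(",") if h.strip()]
    # Last node first, first node (assumed to be the bootstrap node) last.
    return list(reversed(hosts))


sample = """\
[mariadb]
datadir = 192.168.108.50/var/mysql,192.168.108.56/var/mysql,192.168.108.58/var/mysql
user = xxxxxxxxxxxxxx
password = xxxxxxxxxxx
host = 192.168.108.50,192.168.108.56,192.168.108.58
master_id = 0
"""
order = mariadb_restart_order(sample)
for position, host in enumerate(order, start=1):
    print(f"restart {position}: {host}")
```

Always confirm the actual bootstrap node from infra.yaml, as described in Step 2, before relying on this order.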

Step 2. Log in to each MariaDB node and identify the node that has the bootstrap configuration. The bootstrap setting is in the below file.

/opt/rdaf/deployment-scripts/<Host-IP>/infra.yaml
  • Under the mariadb section, change

MARIADB_GALERA_CLUSTER_BOOTSTRAP=yes
to

MARIADB_GALERA_CLUSTER_BOOTSTRAP=no
rdauser@infra108122:~$ vi /opt/rdaf/rdaf.cfg
rdauser@infra108122:~$ vi /opt/rdaf/deployment-scripts/192.168.108.122/infra.yaml

mariadb:
    image: 192.168.108.122:5000/rda-platform-mariadb:1.0.4
    restart: 'no'
    network_mode: host
    mem_limit: 8G
    memswap_limit: 8G
    oom_kill_disable: false
    volumes:
    - /var/mysql:/bitnami/mariadb/data/
    - /opt/rdaf/config/mariadb:/opt/bitnami/mariadb/conf/bitnami/
    - /opt/rdaf/logs/mariadb:/opt/rdaf/log/
    logging:
      driver: json-file
      options:
        max-size: 10m
        max-file: '5'
    environment:
    - MARIADB_GALERA_MARIABACKUP_USER=rdaf_backup
    - MARIADB_GALERA_MARIABACKUP_PASSWORD=rdaf_backup
    - MARIADB_GALERA_NODE_ADDRESS=192.168.108.122
    - MARIADB_GALERA_NODE_NAME=192.168.108.122
    - MARIADB_USER=rdaf
    - MARIADB_PASSWORD=rdaf123!
    - MARIADB_ROOT_PASSWORD=rdaf123!
    - MARIADB_GALERA_CLUSTER_NAME=rdaf_galera
    - MARIADB_REPLICATION_USER=rdaf_replica
    - MARIADB_REPLICATION_PASSWORD=rdaf_replica
    - MARIADB_GALERA_CLUSTER_ADDRESS=gcomm://192.168.108.122,192.168.108.123,192.168.108.124
    - MARIADB_GALERA_CLUSTER_BOOTSTRAP=no

Step 3. The MariaDB container on the bootstrap node identified in Step 2 must be restarted last. Restart the MariaDB nodes in reverse order, starting with the third node: log in to the third node and restart its container using the below commands.

docker ps -a | grep maria
docker stop -t 120 <container_id>
docker start <container_id>
rdauser@infra108124:~$ docker ps -a | grep maria
ac633bfab6c4   192.168.108.122:5000/rda-platform-mariadb:1.0.4            "/opt/bitnami/script…"   25 hours ago   Up 25 hours             infra-mariadb-1
rdauser@infra108124:~$ docker stop -t 120 ac633bfab6c4
ac633bfab6c4
rdauser@infra108124:~$ docker start ac633bfab6c4
ac633bfab6c4
rdauser@infra108124:~$ cd /opt/rdaf/logs/mariadb/
rdauser@infra108124:/opt/rdaf/logs/mariadb$ ls
auto-restart.log  auto-restart.log.1.gz  mariadb-slow.log  mariadb-slow.log.1.gz  mariadb.log  mariadb.log.1.gz
rdauser@infra108124:/opt/rdaf/logs/mariadb$ grep WSREP *
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Member 1.0 (192.168.108.123) synced with group.
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Processing event queue:... 100.0% (1/1 events) complete.
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 3060)
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Server 192.168.108.123 synced with group
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Server status change joined -> synced
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Synchronized with group, ready for connections
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

Step 4. Run the following MySQL commands to verify that the MariaDB node has successfully rejoined the Galera cluster.

mysql -h 192.168.108.123 -P 3306 -urdaf -prdaf123!
rdauser@infra108123:/opt/rdaf/logs/mariadb$ mysql -h 192.168.108.123 -P 3306 -urdaf -prdaf123!
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 59
Server version: 11.4.5-MariaDB-log Source distribution

Copyright (c) 2000, 2026, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
  • Please use the below command to check if the node is connected to the cluster.
mysql> SHOW STATUS LIKE 'wsrep_connected';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| wsrep_connected | ON    |
+-----------------+-------+
1 row in set (0.00 sec)
  • Use the command given below to verify the node is ready to accept queries.
mysql> SHOW STATUS LIKE 'wsrep_ready';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| wsrep_ready   | ON    |
+---------------+-------+
1 row in set (0.00 sec)
  • Please use the below command to check the local node sync state.
mysql> SHOW STATUS LIKE 'wsrep_local_state_comment';
+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_state_comment | Synced |
+---------------------------+--------+
1 row in set (0.01 sec)
  • Use the command given below to confirm all three nodes are part of the cluster.
mysql> SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+
1 row in set (0.01 sec)
  • Please use the below command to verify the cluster is in Primary state.
mysql> SHOW STATUS LIKE 'wsrep_cluster_status';
+----------------------+---------+
| Variable_name        | Value   |
+----------------------+---------+
| wsrep_cluster_status | Primary |
+----------------------+---------+
1 row in set (0.00 sec)
  • Use the command given below to check the local node index within the cluster.
mysql> SHOW STATUS LIKE 'wsrep_local_index';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| wsrep_local_index | 1     |
+-------------------+-------+
1 row in set (0.00 sec)
  • Please use the below command to verify all node addresses are listed.
mysql> SHOW STATUS LIKE 'wsrep_incoming_addresses';
+--------------------------+----------------------------------------------------------------+
| Variable_name            | Value                                                          |
+--------------------------+----------------------------------------------------------------+
| wsrep_incoming_addresses | 192.168.108.122:3306,192.168.108.123:3306,192.168.108.124:3306 |
+--------------------------+----------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> exit
Bye
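The checks in this step can be collected into a single pass/fail summary. The sketch below is a hedged Python example that takes the status values as a plain dictionary and compares them with the values shown in the outputs above. How the dictionary is filled (for example by parsing the mysql output) is left open, and the names EXPECTED and galera_problems are hypothetical.

```python
# Expected values, as shown in the sample outputs of this step.
EXPECTED = {
    "wsrep_connected": "ON",
    "wsrep_ready": "ON",
    "wsrep_local_state_comment": "Synced",
    "wsrep_cluster_status": "Primary",
}


def galera_problems(status, expected_nodes):
    """Return a list of human-readable problems; empty if the node looks healthy."""
    problems = []
    for name, want in EXPECTED.items():
        got = status.get(name)
        if got != want:
            problems.append(f"{name}: expected {want!r}, got {got!r}")
    size = int(status.get("wsrep_cluster_size", 0))
    if size != expected_nodes:
        problems.append(f"wsrep_cluster_size: expected {expected_nodes}, got {size}")
    addresses = [a for a in status.get("wsrep_incoming_addresses", "").split(",") if a]
    if len(addresses) != expected_nodes:
        problems.append(
            f"wsrep_incoming_addresses: expected {expected_nodes} entries, got {len(addresses)}"
        )
    return problems


# Values copied from the sample session above.
status = {
    "wsrep_connected": "ON",
    "wsrep_ready": "ON",
    "wsrep_local_state_comment": "Synced",
    "wsrep_cluster_size": "3",
    "wsrep_cluster_status": "Primary",
    "wsrep_incoming_addresses": "192.168.108.122:3306,192.168.108.123:3306,192.168.108.124:3306",
}
print(galera_problems(status, expected_nodes=3) or "node looks healthy")
```

Running the same comparison after each node restart gives a consistent gate before moving on to the next node.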

Step 5. Log in to the second node and restart the MariaDB container.

docker stop -t 120 <container_id>
docker start <container_id>
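  • After restarting the container, check the MariaDB logs (for example with tail or grep, as in the output below) for the WSREP lines showing that the node has synced with the group, such as "Synchronized with group, ready for connections".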
rdauser@infra108124:~$ docker ps -a | grep maria
ac633bfab6c4   192.168.108.122:5000/rda-platform-mariadb:1.0.4            "/opt/bitnami/script…"   25 hours ago   Up 25 hours             infra-mariadb-1
rdauser@infra108124:~$ docker stop -t 120 ac633bfab6c4
ac633bfab6c4
rdauser@infra108124:~$ docker start ac633bfab6c4
ac633bfab6c4
rdauser@infra108124:~$ cd /opt/rdaf/logs/mariadb/
rdauser@infra108124:/opt/rdaf/logs/mariadb$ ls
auto-restart.log  auto-restart.log.1.gz  mariadb-slow.log  mariadb-slow.log.1.gz  mariadb.log  mariadb.log.1.gz
rdauser@infra108124:/opt/rdaf/logs/mariadb$ grep WSREP *
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Member 1.0 (192.168.108.123) synced with group.
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Processing event queue:... 100.0% (1/1 events) complete.
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 3060)
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Server 192.168.108.123 synced with group
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Server status change joined -> synced
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Synchronized with group, ready for connections
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
  • Once the container is restarted, run the MySQL verification commands from Step 4 to confirm the node has rejoined the cluster.
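
Because mariadb.log can already contain sync messages from earlier restarts, it helps to look only at lines written after the restart. The sketch below is one way to do that; the function name wait_for_sync and its arguments are hypothetical, and the log path is the one shown in the example output above.

```shell
# wait_for_sync <log_file> [skip_lines] [timeout_seconds]
# Polls the log once per second and succeeds as soon as a line written
# after the first <skip_lines> lines reports that the node is synced.
wait_for_sync() {
  log_file="$1"
  skip_lines="${2:-0}"
  timeout="${3:-120}"
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    if tail -n "+$((skip_lines + 1))" "$log_file" 2>/dev/null |
       grep -q 'WSREP: Synchronized with group, ready for connections'; then
      echo "synced"
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "timed out after ${timeout}s"
  return 1
}

# Example usage:
# before=$(wc -l < /opt/rdaf/logs/mariadb/mariadb.log)   # run before the restart
# docker stop -t 120 <container_id> && docker start <container_id>
# wait_for_sync /opt/rdaf/logs/mariadb/mariadb.log "$before" 120
```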

Step 6. Restart MariaDB — First Node (bootstrap node)

  • Log in to the first MariaDB node and navigate to the directory that contains the infra.yaml file: cd /opt/rdaf/deployment-scripts/<Host IP>/

  • To stop the MariaDB container use the below command.

docker-compose -f infra.yaml --project-name infra rm -fsv mariadb
  • To start the MariaDB container use the below command.
docker-compose -f infra.yaml --project-name infra up -d mariadb
rdauser@infra108122:~$ cd /opt/rdaf/deployment-scripts/192.168.108.122/
rdauser@infra108122:/opt/rdaf/deployment-scripts/192.168.108.122$ ls
infra.yaml
rdauser@infra108122:/opt/rdaf/deployment-scripts/192.168.108.122$ vi infra.yaml
rdauser@infra108122:/opt/rdaf/deployment-scripts/192.168.108.122$ vi infra.yaml
rdauser@infra108122:/opt/rdaf/deployment-scripts/192.168.108.122$ docker-compose -f infra.yaml --project-name infra rm -fsv mariadb
[+] Stopping 1/1
 ✔ Container infra-mariadb-1  Stopped                                                                                                                                          0.6s
rdauser@infra108122:/opt/rdaf/deployment-scripts/192.168.108.122$ docker-compose -f infra.yaml --project-name infra up -d mariadb
[+] Running 1/1
 ✔ Container infra-mariadb-1  Started                                
rdauser@infra108122:/opt/rdaf/logs/mariadb$ grep WSREP *
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Member 1.0 (192.168.108.123) synced with group.
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Processing event queue:... 100.0% (1/1 events) complete.
mariadb.log:2026-04-24 13:12:41 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 3060)
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Server 192.168.108.123 synced with group
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Server status change joined -> synced
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: Synchronized with group, ready for connections
mariadb.log:2026-04-24 13:12:41 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

Step 7. Verify All Nodes Are Synced

  • Once the first node is up, run the MySQL verification commands from Step 4 to confirm all nodes have rejoined and are in sync.
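
When checking wsrep_incoming_addresses across the cluster, a small helper can report which expected node is missing from the list. This is an illustrative sketch; the function name missing_nodes is hypothetical, and the IP addresses are the ones used in this guide's examples.

```shell
# missing_nodes <incoming_addresses> <ip> [<ip> ...]
# Prints every expected IP that does not appear (as ip:port) in the
# comma-separated wsrep_incoming_addresses value. Prints nothing when
# all nodes are present.
missing_nodes() {
  incoming="$1"
  shift
  for ip in "$@"; do
    case ",$incoming," in
      *",$ip:"*) ;;          # node address found, nothing to report
      *) echo "$ip" ;;       # node address not listed
    esac
  done
}

# Example usage with the value from the Step 4 sample output:
# missing_nodes "192.168.108.122:3306,192.168.108.123:3306,192.168.108.124:3306" \
#   192.168.108.122 192.168.108.123 192.168.108.124
```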

Step 9. Restore Bootstrap Configuration on First Node

  • Open /opt/rdaf/deployment-scripts/192.168.108.122/infra.yaml and, under the mariadb section, update the bootstrap setting as shown below.

From

MARIADB_GALERA_CLUSTER_BOOTSTRAP=no

To

MARIADB_GALERA_CLUSTER_BOOTSTRAP=yes

After the update, the mariadb section should look similar to the example below.

mariadb:
    image: 192.168.108.122:5000/rda-platform-mariadb:1.0.4
    restart: 'no'
    network_mode: host
    mem_limit: 8G
    memswap_limit: 8G
    oom_kill_disable: false
    volumes:
    - /var/mysql:/bitnami/mariadb/data/
    - /opt/rdaf/config/mariadb:/opt/bitnami/mariadb/conf/bitnami/
    - /opt/rdaf/logs/mariadb:/opt/rdaf/log/
    logging:
      driver: json-file
      options:
        max-size: 10m
        max-file: '5'
    environment:
    - MARIADB_GALERA_MARIABACKUP_USER=rdaf_backup
    - MARIADB_GALERA_MARIABACKUP_PASSWORD=rdaf_backup
    - MARIADB_GALERA_NODE_ADDRESS=192.168.108.122
    - MARIADB_GALERA_NODE_NAME=192.168.108.122
    - MARIADB_USER=rdaf
    - MARIADB_PASSWORD=rdaf123!
    - MARIADB_ROOT_PASSWORD=rdaf123!
    - MARIADB_GALERA_CLUSTER_NAME=rdaf_galera
    - MARIADB_REPLICATION_USER=rdaf_replica
    - MARIADB_REPLICATION_PASSWORD=rdaf_replica
    - MARIADB_GALERA_CLUSTER_ADDRESS=gcomm://192.168.108.122,192.168.108.123,192.168.108.124
    - MARIADB_GALERA_CLUSTER_BOOTSTRAP=yes
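
If you prefer not to edit the file by hand, the same change can be made with sed. The sketch below is an assumption-laden convenience, not an rdaf feature: the function name set_galera_bootstrap is hypothetical, and it assumes the setting appears only in the mariadb section of infra.yaml. A copy of the original file is kept with a .bak suffix.

```shell
# set_galera_bootstrap <compose_file> <yes|no>
# Rewrites the MARIADB_GALERA_CLUSTER_BOOTSTRAP value in place, keeps a
# backup as <compose_file>.bak, and prints the resulting line(s).
set_galera_bootstrap() {
  compose_file="$1"
  value="$2"
  case "$value" in
    yes|no) ;;
    *) echo "value must be yes or no" >&2; return 2 ;;
  esac
  sed -i.bak "s/\(MARIADB_GALERA_CLUSTER_BOOTSTRAP=\)[a-z]*/\1${value}/" "$compose_file"
  grep -n 'MARIADB_GALERA_CLUSTER_BOOTSTRAP=' "$compose_file"
}

# Example usage:
# cd /opt/rdaf/deployment-scripts/192.168.108.122/
# set_galera_bootstrap infra.yaml yes
```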
  • Restart HAProxy Containers

Step 10. Identify HAProxy VMs

First, run the following command to determine which VMs are running HAProxy.

rdaf infra status

Check the IP addresses where HAProxy is running.

Step 11. Login to HAProxy VMs and identify the Virtual IP

Now, log in to each VM that is running HAProxy and run the command given below.

ip addr show

From the output, identify the Virtual IP (VIP) configured on the system. In the example output below, the VIP is the secondary address on ens160 (192.168.108.129/24).

rdauser@infra108122:~$ ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:93:36:53 brd ff:ff:ff:ff:ff:ff
    altname enp3s0
    inet 192.168.108.122/24 brd 192.168.108.255 scope global ens160
       valid_lft forever preferred_lft forever
    inet 192.168.108.129/24 scope global secondary ens160
       valid_lft forever preferred_lft forever
    inet6 fe80::250:56ff:fe93:3653/64 scope link
       valid_lft forever preferred_lft forever
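
To pick out the VIP without reading the full output, the secondary IPv4 addresses can be filtered with awk. This is a sketch under the assumption that the VIP is reported as a "secondary" address, as in the example above; the function name list_secondary_ipv4 is hypothetical.

```shell
# list_secondary_ipv4
# Reads `ip addr show` output on stdin and prints each IPv4 address
# flagged as secondary, without the prefix length.
list_secondary_ipv4() {
  awk '$1 == "inet" && / secondary / { sub(/\/.*/, "", $2); print $2 }'
}

# Example usage on a HAProxy VM:
# ip -4 addr show | list_secondary_ipv4
```

An empty result on a VM suggests that the VIP is currently held by another HAProxy node.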

Note

In an HA environment, the VM where the Virtual IP (VIP) is present should be restarted last.

Restart the HAProxy containers on all the VMs where they are deployed.

Step 12. Restart the HAProxy containers one by one on each VM using the docker commands given below

docker ps -a | grep hap
docker stop -t 120 <container_id>
docker start <container_id>
rdauser@infra108122:~$ docker ps -a | grep hap
9567576aa14c   192.168.108.122:5000/rda-platform-haproxy:1.0.4                           "/docker-entry-point…"   26 hours ago    Up 4 hours                                        infra-haproxy-1
rdauser@infra108122:~$ docker stop -t 120 9567576aa14c
9567576aa14c
rdauser@infra108122:~$ docker start 9567576aa14c
9567576aa14c
rdauser@infra108122:~$
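
Since the VM holding the VIP should be restarted last, it can help to write down the restart order before starting. The following sketch only prints that order; the function name restart_order is hypothetical, and the host list must be supplied from the rdaf infra status output.

```shell
# restart_order <vip_host> <host> [<host> ...]
# Prints the HAProxy hosts in the order they should be restarted:
# every host that does not hold the VIP first, the VIP holder last.
restart_order() {
  vip_host="$1"
  shift
  for host in "$@"; do
    if [ "$host" != "$vip_host" ]; then
      echo "$host"
    fi
  done
  for host in "$@"; do
    if [ "$host" = "$vip_host" ]; then
      echo "$host"
    fi
  done
}

# Example usage, assuming 192.168.108.122 currently holds the VIP:
# restart_order 192.168.108.122 192.168.108.122 192.168.108.123
```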

Step 13. Verify that HAProxy is healthy using the command given below. All HAProxy and keepalived entries should show a running/active state.

rdaf infra status
| haproxy    | 192.168.108.122 | Up 2 days | 9567576aa14c | 1.0.4 |
| haproxy    | 192.168.108.123 | Up 2 days | 6a0001f08b82 | 1.0.4 |
| keepalived | 192.168.108.122 | active    | N/A          | N/A   |
| keepalived | 192.168.108.123 | active    | N/A          | N/A   |

Step 14. Run the health check using the command given below. All checks should return OK.

rdaf infra healthcheck
| haproxy    | Port Connection | OK | N/A | 192.168.108.122 | 9567576aa14c |
| haproxy    | Service Status  | OK | N/A | 192.168.108.122 | 9567576aa14c |
| haproxy    | Firewall Port   | OK | N/A | 192.168.108.122 | 9567576aa14c |
| haproxy    | Port Connection | OK | N/A | 192.168.108.123 | 6a0001f08b82 |
| haproxy    | Service Status  | OK | N/A | 192.168.108.123 | 6a0001f08b82 |
| haproxy    | Firewall Port   | OK | N/A | 192.168.108.123 | 6a0001f08b82 |
| keepalived | Service Status  | OK | N/A | 192.168.108.122 | N/A          |
| keepalived | Service Status  | OK | N/A | 192.168.108.123 | N/A          |
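
For longer health check outputs, the rows that are not OK can be filtered automatically. This is a sketch that assumes the column layout shown in the sample rows above (service, check, status as the first three columns); the function name failed_checks is hypothetical.

```shell
# failed_checks
# Reads pipe-delimited health check rows on stdin and prints every row
# whose status column is not OK. Exits with 1 if any such row is found.
failed_checks() {
  awk -F'|' 'NF >= 4 {
    name = $2; check = $3; status = $4
    gsub(/^[ \t]+|[ \t]+$/, "", name)
    gsub(/^[ \t]+|[ \t]+$/, "", check)
    gsub(/^[ \t]+|[ \t]+$/, "", status)
    if (status != "OK") { print name ": " check " = " status; bad = 1 }
  }
  END { exit (bad ? 1 : 0) }'
}

# Example usage (grep keeps only the HAProxy and keepalived rows, which
# also drops any header or border lines):
# rdaf infra healthcheck | grep -E 'haproxy|keepalived' | failed_checks
```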

1.2.6 RDA Studio Upgrade

Navigate to the rda-studio.yml file, change the existing image tag to 8.1.0.6 so that it matches the example below, and save the file.

services:
  cfxdx:
    image: docker1.cloudfabrix.io:443/external/ubuntu-cfxdx-nb-nginx-all:8.1.0.6
    restart: unless-stopped
    volumes:
    - /opt/rdaf/cfxdx/home/:/root
    - /opt/rdaf/cfxdx/config/:/tmp/config/
    - /opt/rdaf/cfxdx/output:/tmp/output/
    - /opt/rdaf/config/network_config/:/network_config
    ports:
    - "9998:9998"
    environment:
      #JUPYTER_TOKEN: cfxdxdemo
      NLTK_DATA: "/root/nltk_data"
      CFXDX_CONFIG_FILE: /tmp/config/conf.yml
      RDA_NETWORK_CONFIG: /network_config/config.json
      RDA_USER: xxxxxxx
      RDA_PASSWORD: xxxxxxxxxxxx

After setting the tag version to 8.1.0.6 in the rda-studio.yml file, run the following commands to pull the latest images and start the services.

docker-compose -f rda-studio.yml pull 
docker-compose -f rda-studio.yml up -d
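
The tag change described above can also be scripted. The sketch below is one possible approach, assuming the image name ends in ubuntu-cfxdx-nb-nginx-all as in the example; the function name set_studio_tag is hypothetical. It keeps a backup of the original file with a .bak suffix.

```shell
# set_studio_tag <compose_file> <tag>
# Replaces the tag on the ubuntu-cfxdx-nb-nginx-all image line, keeps a
# backup as <compose_file>.bak, and prints the resulting image line(s).
set_studio_tag() {
  compose_file="$1"
  tag="$2"
  sed -i.bak "s|\(image: .*ubuntu-cfxdx-nb-nginx-all\):[^[:space:]]*|\1:${tag}|" "$compose_file"
  grep 'image:' "$compose_file"
}

# Example usage:
# set_studio_tag rda-studio.yml 8.1.0.6
# docker-compose -f rda-studio.yml pull
# docker-compose -f rda-studio.yml up -d
```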