Application Resource Group Always Failed Over to Secondary Node Real Application Cluster (RAC) with "smsNamingServer:Failed to start service" Error Message (Doc ID 1374840.1)

Last updated on OCTOBER 18, 2016

Applies to:

Oracle Communications Network Charging and Control - Version 4.3.0 and later
Information in this document applies to any platform.
***Checked for relevance on 16-03-2015***

Symptoms

On the SMS (Service Management System) node, the application resource group always fails over to the secondary Real Application Cluster (RAC) node.

The following example describes the behavior in full:

  1. New installation of the smsCluster package on top of an Oracle RAC environment.

    Example:

        1. Primary node (ncc-sms01)
        2. Secondary node (ncc-sms02)
     
  2. Switchover from the secondary node to the primary node fails. The details are as follows.

    System log on the ncc-sms02 node:
    -------------------------------

- All resources on ncc-sms02 go offline:

Cluster.RGM.global.rgmd: [ID 529407 daemon.notice] resource group SmsScreens-harg state on node ncc-sms02 change to RG_PENDING_OFFLINE
Cluster.RGM.global.rgmd: [ID 922363 daemon.notice] resource SmsNamingServer-hars status msg on node ncc-sms02 change to <Stopping>
Cluster.RGM.global.rgmd: [ID 529407 daemon.notice] resource group SmsScreens-harg state on node ncc-sms02 change to RG_OFFLINE

- All resources on ncc-sms01 start to come online:

Cluster.RGM.global.rgmd: [ID 529407 daemon.notice] resource group SmsScreens-harg state on node ncc-sms01 change to RG_PENDING_ONLINE
Cluster.RGM.global.rgmd: [ID 443746 daemon.notice] resource SmsNamingServer-hars state on node ncc-sms01 change to R_STARTING
Cluster.RGM.global.rgmd: [ID 529407 daemon.notice] resource group SmsScreens-harg state on node ncc-sms01 change to RG_ONLINE

- While the resources are starting on ncc-sms01, smsNamingServer faults:

Cluster.RGM.global.rgmd: [ID 784560 daemon.notice] resource SmsNamingServer-hars status on node ncc-sms01 change to R_FM_FAULTED
Cluster.RGM.global.rgmd: [ID 922363 daemon.notice] resource SmsNamingServer-hars status msg on node ncc-sms01 change to <Service daemon not running.>
Cluster.RGM.global.rgmd: [ID 529407 daemon.notice] resource group SmsScreens-harg state on node ncc-sms01 change to RG_ON_PENDING_R_RESTART
Cluster.RGM.global.rgmd: [ID 443746 daemon.notice] resource SmsNamingServer-hars state on node ncc-sms01 change to R_ONLINE_UNMON
Cluster.RGM.global.rgmd: [ID 784560 daemon.notice] resource SmsNamingServer-hars status on node ncc-sms01 change to R_FM_UNKNOWN
Cluster.RGM.global.rgmd: [ID 922363 daemon.notice] resource SmsNamingServer-hars status msg on node ncc-sms01 change to <Stopping>

- The resources on ncc-sms01 fail to start, and the resource group fails over to ncc-sms02:

Cluster.RGM.global.rgmd: [ID 529407 daemon.error] resource group SmsScreens-harg state on node ncc-sms01 change to RG_OFFLINE_START_FAILED
Cluster.RGM.global.rgmd: [ID 529407 daemon.notice] resource group SmsScreens-harg state on node ncc-sms01 change to RG_OFFLINE
Cluster.RGM.global.rgmd: [ID 529407 daemon.notice] resource group SmsScreens-harg state on node ncc-sms02 change to RG_PENDING_ONLINE
Cluster.RGM.global.rgmd: [ID 224900 daemon.notice] launching method <hafoip_prenet_start> for resource <ncc-sms-screen>, resource group <SmsScreens-harg>, node <ncc-sms02>, timeout <300> seconds
Cluster.RGM.global.rgmd: [ID 784560 daemon.notice] resource ncc-cbt-sms-screen status on node ncc-sms02 change to R_FM_UNKNOWN
Cluster.RGM.global.rgmd: [ID 922363 daemon.notice] resource ncc-cbt-sms-screen status msg on node ncc-sms02 change to <Starting>
Cluster.RGM.global.rgmd: [ID 515159 daemon.notice] method <SmsNamingServer_monitor_start> completed successfully for resource <SmsNamingServer-hars>, resource group <SmsScreens-harg>, node <ncc-sms02>, time used: 0% of timeout <300 seconds>
Cluster.RGM.global.rgmd: [ID 443746 daemon.notice] resource SmsNamingServer-hars state on node ncc-sms02 change to R_ONLINE
Cluster.RGM.global.rgmd: [ID 922363 daemon.notice] resource SmsTaskAgent-hars status msg on node ncc-sms02 change to <Service is online.>

  3. At the same time, the following error is logged in the system log file on ncc-sms01:

root@ncc-sms01$ tail /var/adm/messages
Cluster.PMF.pmfd: [ID 887656 daemon.notice] Process: tag="SmsScreens-harg,SmsNamingServer-hars,0.svc", cmd="/bin/ksh -c /usr/bin/su - smf_oper -c 'exec /IN/service_packages/SMS/bin/smsNamingServerStartup.sh >> /IN/service_packages/SMS/tmp/smsNamingServer.log 2>/IN/service_packages/SMS/tmp/smsNamingServer.log'", Failed to stay up.
SC[.SmsNamingServer:4,SmsScreens-harg,SmsNamingServer-hars,SmsNamingServer_svc_start]: [ID 499150 daemon.error] Failed to start service.

  4. The following error is logged in smsNamingServer.log on ncc-sms01:

root@ncc-sms01$ cat smsNamingServer.log
/u01/app/oracle/product/9.2/lib32/libclntsh.so.9.0: Permission denied
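
The message above indicates that access to the Oracle client library was denied when smsNamingServerStartup.sh ran as smf_oper. The access check can be repeated by hand with a minimal sketch such as the one below. The function name check_lib_access is hypothetical and is not part of any NCC or Oracle tooling, and the function tests access only for the user who runs it.

```shell
# Hypothetical helper (not an NCC or Oracle tool): report whether the
# current user can read a file. Run it as smf_oper, not as root,
# because root normally passes the read test regardless of file mode.
check_lib_access() {
    lib="$1"
    if [ -r "$lib" ]; then
        echo "readable: $lib"
    else
        echo "NOT readable: $lib"
        # Show the mode and owner of the file and its directory on stderr
        # so the permission that blocks access is visible.
        ls -ld "$lib" "$(dirname "$lib")" >&2
        return 1
    fi
}
```

For example, after logging in as smf_oper (the user shown in the pmfd command line above), run: check_lib_access /u01/app/oracle/product/9.2/lib32/libclntsh.so.9.0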

 

Notes:
  • The system log file is /var/adm/messages.
  • The smsNamingServer log files are located in /IN/service_packages/SMS/tmp.
  • To perform a switchover:
      - Log in to the Service Management System (SMS) node as the root user.
      - Run "clrg switch -M -n <node_name> <resource_group>" or "scswitch -z -g <resource_group> -h <node_name>".
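
When reviewing a switchover attempt, it can help to reduce the system log to only the resource group state transitions. The sketch below assumes the rgmd message wording shown in the log excerpts in this document. The function name rg_states is hypothetical.

```shell
# Hypothetical helper: print "<node> <state>" for every state change of
# one resource group, in log order.
# Usage: rg_states <resource_group> [logfile]
rg_states() {
    rg="$1"
    log="${2:-/var/adm/messages}"
    # Keep only the rgmd "resource group <name> state on node ..." lines,
    # then strip everything except the node name and the new state.
    grep "resource group $rg state on node" "$log" |
        sed 's/.*state on node \([^ ]*\) change to \(.*\)$/\1 \2/'
}
```

Running rg_states SmsScreens-harg on each node after a switchover attempt lists the transitions for that node. In the failure described here, the output on ncc-sms01 would be expected to include RG_OFFLINE_START_FAILED.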

Changes

Recent installation of the smsCluster package.

Cause
