Exadata: Cellsrv Service Not Restarting After Machine Reboot (Doc ID 1917968.1)

Last updated on AUGUST 25, 2014

Applies to:

Oracle Exadata Storage Server Software - Release 11.2 to 12.1
Information in this document applies to any platform.


The cellsrv service fails to start after hardware maintenance on a cell node. The following symptoms are also present:


---> The cell alert.log shows content similar to:

[RS] Process /opt/oracle/cell11. (pid: 10196) received exception [signal num: 14] [ADDR:0x0]
Sun Aug 17 12:52:06 2014
Sun Aug 17 12:52:06 2014State dump completed for Cellsrv<10199>
Sun Aug 17 12:52:06 2014
State dump signal delivered to Cellsrv<10199> by RS.
Sun Aug 17 12:52:11 2014
State dump interrupted for Cellsrv<10199> by RS.  It did not complete in 5 seconds.
Clean shutdown signal delivered to OSS<10199>
[RS] monitoring process /opt/oracle/cell11. (pid: 0) returned with error: 124
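
To check a saved copy of the alert.log for this signature, a short script can help. The following is a minimal sketch, assuming the two [RS] messages are worded exactly as in the excerpt above (other software versions may differ). The function name find_rs_events and the event labels are hypothetical, chosen only for illustration.

```python
import re

# Patterns modeled on the two [RS] lines in the excerpt above.
# Assumption: other releases may word these messages differently.
EXCEPTION_RE = re.compile(
    r"\[RS\] Process (\S+) \(pid: (\d+)\) received exception \[signal num: (\d+)\]"
)
MONITOR_RE = re.compile(
    r"\[RS\] monitoring process (\S+) \(pid: (\d+)\) returned with error: (\d+)"
)


def find_rs_events(alert_text):
    """Return (kind, pid, code) tuples for the RS failure lines found in the text."""
    events = []
    for line in alert_text.splitlines():
        match = EXCEPTION_RE.search(line)
        if match:
            # code is the signal number reported by RS
            events.append(("exception", int(match.group(2)), int(match.group(3))))
            continue
        match = MONITOR_RE.search(line)
        if match:
            # code is the error value returned by the monitored process
            events.append(("monitor_error", int(match.group(2)), int(match.group(3))))
    return events
```

Pass the function the full text of the alert.log. An empty list means neither message was found in that text.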



---> The ms-odl.trc file shows content similar to:

[2014-08-17T12:57:54.082-07:00] [ossmgmt] [NOTIFICATION] [] [ms.hwadapter.osadp.MSLnx1OSAdapterImpl] [tid: 13] [ecid:,0] Error occurred during IBPort population.[[
oracle.ossmgmt.ms.core.MSCell$ExecSageException: CELL-02623: The command "/usr/sbin/ibhosts" returned an error code 255.
    at oracle.ossmgmt.ms.core.MSCell.returnCmd(MSCell.java:2575)
    at oracle.ossmgmt.ms.core.MSCell.returnCmd(MSCell.java:2514)
    at oracle.ossmgmt.ms.core.MSCell.returnCmd(MSCell.java:2486)
    at oracle.ossmgmt.ms.hwadapter.osadp.MSLnx1OSAdapterImpl.populateIBPorts(MSLnx1OSAdapterImpl.java:821)
    at oracle.ossmgmt.ms.hwadapter.osadp.MSOSStatsLinux.getNetStats(MSOSStatsLinux.java:1063)
    at oracle.ossmgmt.ms.core.MSIDBPlanMetricDef.collectNetNicStats(MSIDBPlanMetricDef.java:2150)
    at oracle.ossmgmt.ms.core.MSIDBPlanMetricDef.collect(MSIDBPlanMetricDef.java:1881)
    at oracle.ossmgmt.ms.core.MSIDBPlanMetricTimerTask.run(MSIDBPlanMetricTimerTask.java:89)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)
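
The CELL-02623 entry above names the command that failed and its exit code. The following is a minimal sketch for pulling those two values out of a saved ms-odl.trc, assuming the message text matches the excerpt above. The function names find_failed_commands and ibhosts_failed are hypothetical.

```python
import re

# Pattern modeled on the CELL-02623 message in the excerpt above.
# Assumption: the message wording is the same in the trace being scanned.
CMD_ERROR_RE = re.compile(
    r'CELL-02623: The command "([^"]+)" returned an error code (\d+)'
)


def find_failed_commands(trace_text):
    """Return (command, exit_code) pairs for every CELL-02623 entry in the text."""
    return [(m.group(1), int(m.group(2))) for m in CMD_ERROR_RE.finditer(trace_text)]


def ibhosts_failed(trace_text):
    """True if any CELL-02623 entry refers to the ibhosts command."""
    return any(cmd.endswith("/ibhosts") for cmd, _ in find_failed_commands(trace_text))
```

A True result from ibhosts_failed only indicates that the trace contains this error; it does not by itself confirm the cause.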


---> ifconfig output shows bondib0 but does NOT show the ib0 (or ib1) interface, similar to:

bondib0   Link encap:Ethernet  HWaddr 00:00:00:00:00:00
         inet addr:  Bcast:  Mask:
         RX packets:0 errors:0 dropped:0 overruns:0 frame:0
         TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:0
         RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

eth0      Link encap:Ethernet  HWaddr 00:10:E0:0D:8E:DA
         inet addr:  Bcast:  Mask:
         inet6 addr: fe80::210:e0ff:fe0d:8eda/64 Scope:Link
         RX packets:6244 errors:0 dropped:0 overruns:0 frame:0
         TX packets:8425 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:1000
         RX bytes:783377 (765.0 KiB)  TX bytes:2721079 (2.5 MiB)

lo        Link encap:Local Loopback
         inet addr:  Mask:
         inet6 addr: ::1/128 Scope:Host
         UP LOOPBACK RUNNING  MTU:16436  Metric:1
         RX packets:13101 errors:0 dropped:0 overruns:0 frame:0
         TX packets:13101 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:0
         RX bytes:1681999 (1.6 MiB)  TX bytes:1681999 (1.6 MiB)
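
The ifconfig check can also be run against captured output. The following is a minimal sketch, assuming the output uses the layout shown above, with each interface name starting in the first column and its detail lines indented. The function names interface_names and missing_ib_interfaces are hypothetical, and the expected names ib0 and ib1 are taken from the symptom described above.

```python
def interface_names(ifconfig_text):
    """Return interface names from ifconfig output, in order of appearance."""
    names = []
    for line in ifconfig_text.splitlines():
        # An interface block starts with the name in column 0;
        # the detail lines that follow are indented.
        if line and not line[0].isspace():
            # rstrip(":") tolerates output styles that print "name:".
            names.append(line.split()[0].rstrip(":"))
    return names


def missing_ib_interfaces(ifconfig_text, expected=("ib0", "ib1")):
    """Return the expected InfiniBand interfaces that are absent from the output."""
    present = set(interface_names(ifconfig_text))
    return [name for name in expected if name not in present]
```

For output like the sample above, which lists only bondib0, eth0, and lo, missing_ib_interfaces reports both ib0 and ib1 as absent.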





Hardware maintenance on the system (such as a battery replacement) may have left the InfiniBand hardware improperly seated.

