How to diagnose missing resources in SPARC CMT systems (Doc ID 2227574.1)

Last updated on FEBRUARY 08, 2017

Applies to:

Sun SPARC Enterprise T2000 Server - Version All Versions and later
Sun SPARC Enterprise T1000 Server - Version Not Applicable and later
Sun SPARC Enterprise T5120 Server - Version All Versions and later
Sun SPARC Enterprise T5140 Server - Version All Versions and later
Sun SPARC Enterprise T5440 Server - Version All Versions and later
Information in this document applies to any platform.

Goal

Description

In certain circumstances Solaris may not report all system resources as available. Possible causes are a fault (KM 1483194.1), components manually disabled by an end user (KM 1643464.1), or resource constraints applied by Oracle VM/LDoms, which is the case discussed in this document.

NOTE : SPARC T7 introduces Memory DIMM sparing which is enabled by default on fully populated systems. Please refer to KM 2037793.1 for more information.

Symptoms

OBP reports less memory than expected, even though ILOM shows every DIMM present with a fault_state of OK:

-> show -d properties -level all -t /SYS type==DIMM fru_name fault_state
Target | Property | Value
-----------------------------+-----------------------------------+---------------------------------------------------
/SYS/MB/CM/CMP/BOB01/CH0/ | fru_name | 16384MB DDR4 SDRAM DIMM
DIMM | |
/SYS/MB/CM/CMP/BOB01/CH0/ | fault_state | OK
DIMM | |
/SYS/MB/CM/CMP/BOB01/CH1/ | fru_name | 16384MB DDR4 SDRAM DIMM
DIMM | |
/SYS/MB/CM/CMP/BOB01/CH1/ | fault_state | OK
DIMM | |
/SYS/MB/CM/CMP/BOB11/CH0/ | fru_name | 16384MB DDR4 SDRAM DIMM
DIMM | |
/SYS/MB/CM/CMP/BOB11/CH0/ | fault_state | OK
DIMM | |
/SYS/MB/CM/CMP/BOB11/CH1/ | fru_name | 16384MB DDR4 SDRAM DIMM
DIMM | |
/SYS/MB/CM/CMP/BOB11/CH1/ | fault_state | OK
DIMM | |
/SYS/MB/CM/CMP/BOB21/CH0/ | fru_name | 16384MB DDR4 SDRAM DIMM
DIMM | |
/SYS/MB/CM/CMP/BOB21/CH0/ | fault_state | OK
DIMM | |
/SYS/MB/CM/CMP/BOB21/CH1/ | fru_name | 16384MB DDR4 SDRAM DIMM
DIMM | |
/SYS/MB/CM/CMP/BOB21/CH1/ | fault_state | OK
DIMM | |
/SYS/MB/CM/CMP/BOB31/CH0/ | fru_name | 16384MB DDR4 SDRAM DIMM
DIMM | |
/SYS/MB/CM/CMP/BOB31/CH0/ | fault_state | OK
DIMM | |
/SYS/MB/CM/CMP/BOB31/CH1/ | fru_name | 16384MB DDR4 SDRAM DIMM
DIMM | |
/SYS/MB/CM/CMP/BOB31/CH1/ | fault_state | OK
DIMM | |
/SYS/MB/CM/CMP/MR0/BOB20/ | fru_name | 16384MB DDR4 SDRAM DIMM
CH0/DIMM | |
/SYS/MB/CM/CMP/MR0/BOB20/ | fault_state | OK
CH0/DIMM | |
/SYS/MB/CM/CMP/MR0/BOB20/ | fru_name | 16384MB DDR4 SDRAM DIMM
CH1/DIMM | |
/SYS/MB/CM/CMP/MR0/BOB20/ | fault_state | OK
CH1/DIMM | |
/SYS/MB/CM/CMP/MR0/BOB30/ | fru_name | 16384MB DDR4 SDRAM DIMM
CH0/DIMM | |
/SYS/MB/CM/CMP/MR0/BOB30/ | fault_state | OK
CH0/DIMM | |
/SYS/MB/CM/CMP/MR0/BOB30/ | fru_name | 16384MB DDR4 SDRAM DIMM
CH1/DIMM | |
/SYS/MB/CM/CMP/MR0/BOB30/ | fault_state | OK
CH1/DIMM | |
/SYS/MB/CM/CMP/MR1/BOB00/ | fru_name | 16384MB DDR4 SDRAM DIMM
CH0/DIMM | |
/SYS/MB/CM/CMP/MR1/BOB00/ | fault_state | OK
CH0/DIMM | |
/SYS/MB/CM/CMP/MR1/BOB00/ | fru_name | 16384MB DDR4 SDRAM DIMM
CH1/DIMM | |
/SYS/MB/CM/CMP/MR1/BOB00/ | fault_state | OK
CH1/DIMM | |
/SYS/MB/CM/CMP/MR1/BOB10/ | fru_name | 16384MB DDR4 SDRAM DIMM
CH0/DIMM | |
/SYS/MB/CM/CMP/MR1/BOB10/ | fault_state | OK
CH0/DIMM | |
/SYS/MB/CM/CMP/MR1/BOB10/ | fru_name | 16384MB DDR4 SDRAM DIMM
CH1/DIMM | |
/SYS/MB/CM/CMP/MR1/BOB10/ | fault_state | OK
CH1/DIMM | |

->

Alternatively, check the output of 'show components' to confirm whether any devices are currently disabled:

-> show components
Target | Property | Value
-----------------------------+-----------------------------------+---------------------------------------------------
/SYS/MB/CM/CMP | current_config_state | Enabled
/SYS/MB/CM/CMP/BOB01 | current_config_state | Enabled
/SYS/MB/CM/CMP/BOB01/CH0 | current_config_state | Enabled
/SYS/MB/CM/CMP/BOB01/CH0/ | current_config_state | Enabled
DIMM | |
.
.
/SYS/MB/USB_CTRL | current_config_state | Enabled
/SYS/MB/XGBE0 | current_config_state | Enabled
/SYS/MB/XGBE1 | current_config_state | Enabled
/SYS/RIO/VIDEO | current_config_state | Enabled

->

In this example we should expect to see 262144MB (256GB) of main memory, that is 16 DIMMs of 16384MB each, as no modules are disabled or faulted. However, OBP reports just 8GB:

SPARC T7-1, No Keyboard
Copyright (c) 1998, 2016, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.40.1, 8.0000 GB memory installed, Serial #108074902.
Ethernet address 0:10:e0:71:17:96, Host ID: 86711796.
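
The expected total can be derived from the ILOM listing rather than counted by hand. The following is a minimal sketch, assuming the 'show' output has been saved to a text file; the function name sum_dimm_mb and the file name are hypothetical and not part of any Oracle tool.

```shell
#!/bin/sh
# Sum the DIMM sizes found in saved ILOM 'show ... fru_name' output.
# Reads the listing on stdin and prints the total in MB.
# sum_dimm_mb is a hypothetical helper name used for illustration only.
sum_dimm_mb() {
    awk -F'|' '
        # Only rows whose property column is fru_name carry a size.
        $2 ~ /fru_name/ && match($3, /[0-9]+MB/) {
            # Drop the trailing "MB" and accumulate the number.
            total += substr($3, RSTART, RLENGTH - 2)
        }
        END { print total + 0 }
    '
}

# Example (hypothetical file name):
#   sum_dimm_mb < ilom_dimms.txt
```

For the listing shown earlier, 16 fru_name rows of 16384MB each give 262144MB, which is the figure OBP would be expected to report.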

If already booted to Solaris, various tools may report less memory, fewer CPUs or less I/O than expected:

# psrinfo
0 on-line since 01/25/2017 15:18:21
1 on-line since 01/25/2017 15:18:23
2 on-line since 01/25/2017 15:18:23
3 on-line since 01/25/2017 15:18:23
4 on-line since 01/25/2017 15:18:23
5 on-line since 01/25/2017 15:18:23
6 on-line since 01/25/2017 15:18:23
7 on-line since 01/25/2017 15:18:23
#

# prtdiag | grep "^Memory size"
Memory size: 8192 Megabytes
#
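
To compare these figures against the expected hardware, the two outputs can be reduced to plain numbers. This is a sketch only: the function names are hypothetical, and it assumes prtdiag reports memory in Megabytes and psrinfo uses the 'on-line' state string, as in the example above.

```shell
#!/bin/sh
# Extract the memory size from 'prtdiag' output on stdin.
# Assumes the "Memory size: <n> Megabytes" form shown in this document.
prtdiag_mem_mb() {
    awk '/^Memory size:/ { print $3; exit }'
}

# Count the virtual processors reported as on-line by 'psrinfo' on stdin.
psrinfo_online() {
    awk '$2 == "on-line" { n++ } END { print n + 0 }'
}

# Example usage on a live system:
#   prtdiag | prtdiag_mem_mb
#   psrinfo | psrinfo_online
```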

Verify whether the system is booting from the factory-default configuration or from a custom Oracle VM/LDoms configuration:

-> show /HOST/bootmode/ config

/HOST/bootmode
Properties:
config = someldomconfig <<<

->

This can also be verified from the POST logs during platform initialisation:

2017-01-25 15:12:26 0:00:0> NOTICE: Booting config = someldomconfig

It can also be verified from within Solaris:

# ldm list-spconfig
factory-default
someldomconfig [current]
#
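
The check above can be scripted so that a custom configuration is flagged automatically. This is a minimal sketch with hypothetical function names; it assumes the active entry is marked with a trailing '[current]', as in the example output.

```shell
#!/bin/sh
# Print the name of the configuration marked [current] in
# 'ldm list-spconfig' output read from stdin.
current_spconfig() {
    awk '$NF == "[current]" { print $1; exit }'
}

# Succeed only when the current configuration is factory-default.
is_factory_default() {
    [ "$(current_spconfig)" = "factory-default" ]
}

# Example usage on a live system:
#   ldm list-spconfig | is_factory_default || echo "custom LDoms config in use"
```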

NOTE : If the system is missing resources when booted from the 'factory-default' configuration, please contact Oracle Support for further assistance.

Use ldm to verify the domain resource allocation and confirm that it matches what the host reports as available. In the same example as above, 8 CPUs, 8GB of memory and all I/O are assigned to the primary domain:

# ldm list
NAME STATE FLAGS CONS VCPU MEMORY UTIL NORM UPTIME
primary active -n-cv- UART 8 8G 0.3% 0.3% 21m
#

# ldm list-io
NAME TYPE BUS DOMAIN STATUS
---- ---- --- ------ ------
pci_0 BUS pci_0 primary IOV
pci_1 BUS pci_1 primary IOV
pci_2 BUS pci_2 primary IOV
pci_3 BUS pci_3 primary IOV
pci_4 BUS pci_4 primary IOV
/SYS/MB/PCIE6 PCIE pci_0 primary EMP
/SYS/MB/SASHBA PCIE pci_0 primary OCC
/SYS/MB/PCIE4 PCIE pci_1 primary OCC
/SYS/MB/PCIE5 PCIE pci_1 primary EMP
/SYS/MB/NET0 PCIE pci_2 primary OCC
/SYS/MB/NET2 PCIE pci_2 primary OCC
/SYS/MB/PCIE2 PCIE pci_3 primary EMP
/SYS/MB/PCIE3 PCIE pci_3 primary OCC
/SYS/MB/PCIE1 PCIE pci_4 primary EMP
/SYS/MB/NET0/IOVNET.PF0 PF pci_2 primary
/SYS/MB/NET0/IOVNET.PF1 PF pci_2 primary
/SYS/MB/NET2/IOVNET.PF0 PF pci_2 primary
/SYS/MB/NET2/IOVNET.PF1 PF pci_2 primary
#
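
The allocation reported by 'ldm list' can be converted to the same units that prtdiag and psrinfo use, which makes the comparison direct. The sketch below is illustrative only: ldm_domain_alloc is a hypothetical name, and it assumes the column layout of an active domain as shown above (VCPU in column 5, MEMORY in column 6 with an M, G or T suffix).

```shell
#!/bin/sh
# Print "<vcpus> <memory in MB>" for the named domain, parsed from
# 'ldm list' output on stdin. Column positions are an assumption based
# on the active-domain layout shown in this document.
ldm_domain_alloc() {
    awk -v dom="$1" '
        $1 == dom {
            mem  = $6
            unit = substr(mem, length(mem))
            val  = substr(mem, 1, length(mem) - 1)
            if (unit == "G") val *= 1024
            else if (unit == "T") val *= 1024 * 1024
            # An "M" suffix is already in megabytes.
            print $5, val
            exit
        }
    '
}

# Example usage on a live system:
#   ldm list | ldm_domain_alloc primary
```

For the primary domain in this example the result is 8 VCPUs and 8192MB, which matches the psrinfo and prtdiag output and confirms that the shortfall comes from the domain configuration rather than from a fault.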

Use 'ldm list-devices' to confirm which specific resources are not currently allocated to any domain:

# ldm list-devices cpu
# ldm list-devices memory
# ldm list-devices io
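
The three commands can be wrapped in one helper so that the unallocated resources are collected in a single pass. This is a convenience sketch, not an Oracle-supplied script; the function name and the LDM_CMD override variable are hypothetical and exist only so the loop can be exercised without ldm installed.

```shell
#!/bin/sh
# Run 'ldm list-devices' for each resource class named in this document.
# LDM_CMD is a hypothetical override; it defaults to the real ldm command.
list_unallocated() {
    for class in cpu memory io; do
        echo "== unallocated: $class =="
        "${LDM_CMD:-ldm}" list-devices "$class" || return 1
    done
}

# Example usage on a live system:
#   list_unallocated > unallocated.txt
```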

 

Solution
