My Oracle Support Banner

OCI Compute - High Performance Computing (HPC) and GPU Host Debug Logs Collection Script In OCI (Doc ID 3039961.1)

Last updated on AUGUST 07, 2024

Applies to:

Oracle Compute Cloud Service - Version N/A to N/A
Information in this document applies to any platform.

Purpose

Bare Metal shape instances like High Performance Computing (HPC) and GPU hosts may fail for various reasons related to GPU hardware, memory, thermal, PCIe cards, RDMA links, or other connectivity and performance issues. To streamline troubleshooting, it's crucial to have a comprehensive log collection process.

This document outlines the script's functionality and its ability to provide a one-stop solution for log collection, making the troubleshooting process more efficient and saving valuable time for all the parties involved. 

Scope

The scope of this script is to provide a seamless and comprehensive log collection method, covering all potential failure areas to collect the necessary debug logs. 

The script's primary focus is to simplify the log collection process, ensuring that GPU customers can provide all relevant information to support/service teams in a single attempt. This minimizes the time and effort required for both customers and support, leading to quicker issue resolution and enhanced overall productivity.

Details

To view full details, sign in with your My Oracle Support account.

Don't have a My Oracle Support account? Click to get started!


In this Document
Purpose
Scope
Details

My Oracle Support provides customers with access to over a million knowledge articles and a vibrant support community of peers and Oracle experts.