Spark-Submit from Oozie Fails with Kerberos Errors: "TGT is expired. Aborting renew thread for <USERNAME>@<BDA_HOSTNAME>.<DOMAIN>@<REALM>] " (Doc ID 2652354.1)

Last updated on JULY 20, 2024

Applies to:

Big Data Appliance Integrated Software - Version 4.14.0 and later
Information in this document applies to any platform.

Symptoms

Oozie Spark-submit jobs on BDA 4.14 or 5.1 clusters may fail with Kerberos errors like:
 

spark.eventLog.dir=<TIMESTAMP>,317 WARN [TGT Renewer for <USER>/<BDA_HOSTNAME>.<DOMAIN>@<REALM>.] security.UserGroupInformation (UserGroupInformation.java:run(1041)) - Exception encountered while running the renewal command for <USER>/<BDA_HOSTNAME>.<DOMAIN>@<REALM>. (TGT end time:<EPOCH_TIMESTAMP>, renewalFailures:
...
ExitCodeException exitCode=1: kinit: Ticket expired while renewing credentials
at org.apache.hadoop.util.Shell.runCommand(Shell.java:604)
at org.apache.hadoop.util.Shell.run(Shell.java:507)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:789)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:882)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:865)
at org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:1020)
at java.lang.Thread.run(Thread.java:748)
<TIMESTAMP>,342 ERROR [TGT Renewer for <USER>/<BDA_HOSTNAME>.<DOMAIN>@<REALM>] security.UserGroupInformation (UserGroupInformation.java:run(1026))

The following additional errors may also occur:

 

<TIMESTAMP>,311 WARN org.apache.oozie.action.hadoop.HadoopTokenHelper: SERVER[<OOZIE_HOST>.<DOMAIN>] USER[<USERNAME>] GROUP[-] TOKEN[] APP[<oozie.wf.application.path><JOB_NAME>] JOB[<JOB_ID>-oozie-oozi-W] ACTION[<ACTION_ID>-oozie-oozi-W@<DIRECTORY_PATH>_<DATE>] An error happened while trying to get server principal. Getting it from service principal anyway.
java.lang.IllegalArgumentException: Does not contain a valid host:port authority: yarnRM
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:213)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)

...

In such cases, the job runs successfully when submitted interactively from the CLI, but fails when run through the Oozie scheduler.
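
When triaging a failed Oozie action, it can help to confirm that the launcher or server log actually contains the signatures quoted above. The following is a minimal sketch, not an Oracle-provided tool: the function name find_symptoms, the signature labels, and the sample text are hypothetical, and only the two matched strings are taken from the log excerpts in this note.

```python
import re

# Signature strings copied from the log excerpts in the Symptoms section.
# The dictionary keys are illustrative labels, not Oracle or Hadoop names.
SIGNATURES = {
    "tgt_renewal_failed": re.compile(
        r"kinit: Ticket expired while renewing credentials"
    ),
    "invalid_rm_authority": re.compile(
        r"Does not contain a valid host:port authority"
    ),
}


def find_symptoms(log_text):
    """Return {label: [1-based line numbers]} for each signature found."""
    hits = {}
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.setdefault(label, []).append(lineno)
    return hits


# Hypothetical sample assembled from lines shown in this note.
sample = (
    "ExitCodeException exitCode=1: kinit: Ticket expired while renewing credentials\n"
    "at org.apache.hadoop.util.Shell.runCommand(Shell.java:604)\n"
    "java.lang.IllegalArgumentException: Does not contain a valid host:port authority: yarnRM\n"
)

print(find_symptoms(sample))
```

An empty result means neither signature is present in the text that was scanned. It does not rule out other Kerberos problems.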

Cause


