Job status summary
Jobs can fail in various ways.
The table below summarizes what ends up in the server DB for each kind of failure.
In all cases, result.outcome is RESULT_OUTCOME_CLIENT_ERROR (3).
| what happened | result.exit_status | <message> in result.stderr_out |
|---|---|---|
| app crashed | status returned by OS | "process got signal N" or similar |
| app called exit(N), N nonzero | N | "[OS error string] - exit code %d (0x%x) " or similar |
| app aborted by client, too much time | ERR_RSC_LIMIT_EXCEEDED (-177) | "Maximum elapsed time exceeded" |
| app aborted by client, too much disk | ERR_RSC_LIMIT_EXCEEDED (-177) | "Maximum disk usage exceeded" |
| app aborted by client, too much RAM | ERR_RSC_LIMIT_EXCEEDED (-177) | "Maximum memory exceeded" |
| abort requested by scheduler | ERR_ABORTED_BY_PROJECT (-221) | "aborted by project - no longer usable" |
| abort if not started requested by scheduler (because past deadline) | ERR_ABORTED_BY_PROJECT (-221) | |
| too many exit(0)s | ERR_TOO_MANY_EXITS (-226) | "too many exit(0)s" |
| abort requested by user | ERR_ABORTED_VIA_GUI (-197) | |
| input file download failed | ERR_RESULT_DOWNLOAD (-186) | "WU download error: %s" or similar |
| output file upload failed | ERR_RESULT_UPLOAD (-187) | description of failed uploads |
| client exiting, config.abort_jobs_on_exit | ERR_ABORTED_ON_EXIT | "aborting on client exit" |
| scheduler acked active job | EXIT_ABORTED_BY_CLIENT (194) | "Got ack for job that's still active" |
| app launch failed | ERR_RESULT_START (-185) | "couldn't start %s: %d" |
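
As a rough illustration of how the table above might be used when inspecting failed results, the following Python sketch maps an exit_status value to its symbolic name and, for ERR_RSC_LIMIT_EXCEEDED, uses the stderr_out message to tell the three resource limits apart. The function and dictionary names are hypothetical and not part of BOINC; only the numeric codes and messages are taken from the table. ERR_ABORTED_ON_EXIT is left out because the table gives no numeric value for it.

```python
# Hypothetical helper, not part of BOINC. Codes and messages come from the table above.
EXIT_STATUS_NAMES = {
    -177: "ERR_RSC_LIMIT_EXCEEDED",
    -185: "ERR_RESULT_START",
    -186: "ERR_RESULT_DOWNLOAD",
    -187: "ERR_RESULT_UPLOAD",
    -197: "ERR_ABORTED_VIA_GUI",
    -221: "ERR_ABORTED_BY_PROJECT",
    -226: "ERR_TOO_MANY_EXITS",
    194: "EXIT_ABORTED_BY_CLIENT",
}

# The three resource-limit aborts share one exit status; only the message differs.
RESOURCE_LIMIT_MESSAGES = {
    "Maximum elapsed time exceeded": "too much time",
    "Maximum disk usage exceeded": "too much disk",
    "Maximum memory exceeded": "too much RAM",
}


def describe_failure(exit_status, stderr_out=""):
    """Return a short description of a failed result (illustrative only)."""
    name = EXIT_STATUS_NAMES.get(exit_status)
    if name is None:
        # Anything else is assumed to be an OS status (crash) or the app's own exit(N).
        return f"app crashed or called exit({exit_status})"
    if name == "ERR_RSC_LIMIT_EXCEEDED":
        for message, cause in RESOURCE_LIMIT_MESSAGES.items():
            if message in stderr_out:
                return f"{name}: {cause}"
    return name
```

A call such as `describe_failure(-177, "Maximum disk usage exceeded")` would report the disk limit, while an unlisted status falls through to the crash or exit(N) case. Note that an application could itself call exit() with a value that collides with one of the listed codes, so the result is a hint rather than a definitive diagnosis.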