Kubernetes Job Submission
k8s_job_submission.Rmd
library(abba)
To support Kubernetes, abba provides several local submission functions for interacting with a cluster. At a high level, these functions fall into three categories:
- Submit a job
- Get job status
- Retrieve the log of the job
Functions that interface specifically with Kubernetes follow the
naming convention abba_*_k8s_*. More generally,
functions intended for local submission (i.e. not interfacing with the
abba API) are suffixed with _local().
Submit a Job
Job submission is handled by the function
abba_submit_k8s_job_local(). This is the same function the abba API
uses on its side to submit a job.
abba_submit_k8s_job_local(
"/home/mike.stackhouse/repos/abba/test_programs/test_program.R",
container = "atoruscontainers.azurecr.io/openval_4.2.1_focal:2023.09.0.02"
)
# job.batch/test-program-fbc3131d-a850-4081-8224-c06ef32821da created
# $job_id
# [1] "test-program-fbc3131d-a850-4081-8224-c06ef32821da"
#
# $batch_id
# [1] "test_program-62e382a9-c1a7-4527-a32d-46600ee9ff80"
Note that this returns two fields: job_id and
batch_id. The Job ID is the main field to consider and is the
identifier used downstream to interact with Kubernetes. The Batch ID is a
configurable identifier for grouping work within Kubernetes. For now it is a
placeholder for planned functions that will help with the
submission of several programs as a bundle.
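As an illustrative sketch of how the returned job_id feeds the downstream calls, the hypothetical helper below (run_job() is not part of abba) submits a program and immediately looks up its status. The submit and get_status parameters are assumptions added so the sketch can be exercised with stand-in functions; by default they point at the abba functions described in this article.

```r
# Hypothetical helper, not part of abba: submit a program and look up
# its status using the job_id returned by the submission call.
run_job <- function(program,
                    container,
                    submit = abba::abba_submit_k8s_job_local,
                    get_status = abba::abba_get_k8s_job_status_local) {
  # The submission result is assumed to be a list with job_id and batch_id,
  # matching the example output shown above.
  submitted <- submit(program, container = container)

  list(
    job_id   = submitted$job_id,
    batch_id = submitted$batch_id,
    # job_id is the identifier passed to every downstream call
    status   = get_status(submitted$job_id)
  )
}
```

Keeping the whole submission result, rather than only the printed output, means the job_id never has to be copied by hand into later calls.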
Get Job Status
Once the job is running in Kubernetes, abba can poll
its status using the function
abba_get_k8s_job_status_local().
abba_get_k8s_job_status_local("test-program-fbc3131d-a850-4081-8224-c06ef32821da")
# $Succeeded
# $Succeeded$Jobs
# $Succeeded$Jobs[[1]]
# $Succeeded$Jobs[[1]]$id
# [1] "test-program-fbc3131d-a850-4081-8224-c06ef32821da-2dbr8"
#
# $Succeeded$Jobs[[1]]$path
# [1] "/home/mike.stackhouse/repos/abba/test_programs/test_program.R"
#
#
#
# $Succeeded$Description
# [1] "All containers in the Pod have terminated in success, and will not be restarted."
The job statuses can take on the following values:
| Status | Description |
|---|---|
| Pending | The Pod has been accepted by the Kubernetes cluster, but one or more of the containers has not been set up and made ready to run. This includes time a Pod spends waiting to be scheduled as well as the time spent downloading container images over the network. |
| Running | The Pod has been bound to a node, and all of the containers have been created. At least one container is still running, or is in the process of starting or restarting. |
| Succeeded | All containers in the Pod have terminated in success, and will not be restarted. |
| Failed | All containers in the Pod have terminated, and at least one container has terminated in failure. That is, the container either exited with non-zero status or was terminated by the system. |
| Unknown | For some reason the state of the Pod could not be obtained. This phase typically occurs due to an error in communicating with the node where the Pod should be running. |
The API interface methods keep watching until they see a status other than “Pending” or “Running”. A status of “Succeeded”, “Failed”, or “Unknown” implies the job is “complete” in some way.
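The watching behaviour described above can be approximated locally with a small polling loop. This is a minimal sketch, not the API's actual implementation: wait_for_job(), its interval and max_polls parameters, and the injectable get_status argument are all hypothetical, and it assumes the status result is a named list keyed by status, as in the example output.

```r
# Hypothetical helper, not part of abba: poll a job until its status is
# something other than "Pending" or "Running".
wait_for_job <- function(job_id,
                         get_status = abba::abba_get_k8s_job_status_local,
                         interval = 5,
                         max_polls = 60) {
  active <- c("Pending", "Running")

  for (i in seq_len(max_polls)) {
    status <- get_status(job_id)
    # Assumption: the top-level names of the result are the statuses
    phases <- names(status)

    # Stop once no part of the result is still pending or running
    if (length(phases) > 0 && !any(phases %in% active)) {
      return(status)
    }
    Sys.sleep(interval)
  }

  stop("Job ", job_id, " did not complete after ", max_polls, " polls")
}
```

The max_polls cap is a design choice for the sketch: without it, a job stuck in “Pending” would keep the calling session blocked indefinitely.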
Get the Log Content
In this context, the “log” refers to the stdout/stderr of the program itself. The function returns a list containing the Job ID and the individual lines written out by the program. If the R program executes to completion, the content will contain all of the console output from the program itself. Otherwise, the content may instead describe some sort of failure in the pod.
abba_get_k8s_job_log_local("test-program-fbc3131d-a850-4081-8224-c06ef32821da")
# [[1]]
# [[1]]$job_id
# [1] "test-program-fbc3131d-a850-4081-8224-c06ef32821da"
#
# [[1]]$log
# [1] "[1] 4" "Warning message:" "This is a test warning " "La-di-da"
Interacting with Kubernetes
The actual submission of work into Kubernetes is done using system
calls to the kubectl utility. Note that for this to
function, the calling user account (importantly, the user account
executing the abba API) must have kubectl
configured for the intended Kubernetes cluster. These local functions
will **not** work if kubectl is either not configured or not
authorized.
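Before calling the local functions, it can help to confirm that kubectl is at least visible to the R session. The sketch below is a hypothetical pre-flight check (check_kubectl() is not part of abba) that uses base R only. It looks for the kubectl executable and, if found, asks it for the current context with kubectl config current-context. It does not verify that the account is authorized to create jobs on the cluster.

```r
# Hypothetical pre-flight check, not part of abba.
check_kubectl <- function() {
  # Sys.which() returns "" when the executable is not on the PATH
  path <- unname(Sys.which("kubectl"))
  if (!nzchar(path)) {
    return(list(installed = FALSE, context = NA_character_))
  }

  # Ask kubectl which context is currently configured. A non-zero exit
  # status is attached to the output as the "status" attribute.
  out <- tryCatch(
    suppressWarnings(
      system2(path, c("config", "current-context"),
              stdout = TRUE, stderr = TRUE)
    ),
    error = function(e) character(0)
  )
  status <- attr(out, "status")
  ok <- is.null(status) || identical(as.integer(status), 0L)

  list(
    installed = TRUE,
    context = if (ok && length(out) > 0) out[[1]] else NA_character_
  )
}
```

If installed is FALSE or context is NA, the kubectl configuration for the calling user account is the first thing to inspect.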