I'm trying to use a service account token to authenticate with the API server from a pod, but I can't seem to get the kubelet to mount the token. I'm trying to debug this, but I have run out of ideas. First, some context:
The static pod spec for the API server includes a little customization, but compared to the kubeadm defaults it doesn't stray too far:
---
# Source: k8s-control-plane/templates/kube-apiserver-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  namespace: kube-system
  name: kube-apiserver
  labels:
    component: kube-apiserver
    tier: control-plane
spec:
  priorityClass: system-node-critical
  hostNetwork: true
  dnsConfig:
    nameservers:
      - "127.0.0.53"
  restartPolicy: Always
  securityContext:
    readOnlyRootFilesystem: true
  containers:
    - name: kube-apiserver
      image: "registry.k8s.io/kube-apiserver:v1.35.3"
      ports:
        - name: server
          containerPort: 6443
          protocol: TCP
      command:
        - /usr/local/bin/kube-apiserver
      args:
        - "--bind-address=::"
        - "--etcd-servers=https://test0:2379"
        - "--etcd-certfile=/run/kubernetes/pki/etcd_client.crt"
        - "--etcd-keyfile=/run/kubernetes/pki/etcd_client.key"
        - "--etcd-cafile=/run/kubernetes/pki/etcd_server_ca.crt"
        - "--kubelet-client-certificate=/run/kubernetes/pki/api_kubelet_client.crt"
        - "--kubelet-client-key=/run/kubernetes/pki/api_kubelet_client.key"
        - "--kubelet-certificate-authority=/run/kubernetes/pki/kubelet_server_ca.crt"
        - "--service-account-signing-key-file=/run/kubernetes/pki/svc_account.key"
        - "--service-account-issuer=https://test0:6443"
        - "--service-account-key-file=/run/kubernetes/pki/test0_svc_account.pub"
        - "--api-audiences=https://test0:6443"
        - "--tls-cert-file=/run/kubernetes/pki/api_server.crt"
        - "--tls-private-key-file=/run/kubernetes/pki/api_server.key"
        - "--client-ca-file=/run/kubernetes/pki/api_client_ca.crt"
        - "--advertise-address=192.168.18.3"
        - "--allow-privileged=true"
        - "--authorization-mode=Node,RBAC"
        - "--service-cluster-ip-range=172.16.0.0/16"
        - "--enable-admission-plugins=NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota"
      volumeMounts:
      ...
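Since all of these key files are generated by me, one sanity check worth recording: confirming that the signing key and the trusted public key are actually a pair (a minimal sketch; assumes openssl is on the node, and uses the paths from the spec above):

```shell
# Derive the public key from the signing key and compare it byte-for-byte
# against the public key the API server is told to trust.
check_sa_keypair() {
    openssl pkey -in "$1" -pubout 2>/dev/null | diff -q - "$2" >/dev/null
}

# Against the paths from the static pod spec:
# check_sa_keypair /run/kubernetes/pki/svc_account.key /run/kubernetes/pki/test0_svc_account.pub \
#   && echo "key pair matches"
```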
The logs from the kube-apiserver pod report nothing out of the ordinary once they can reach etcd:
I0402 22:16:03.400780 1 options.go:263] external host was not specified, using 192.168.18.3
I0402 22:16:03.402635 1 server.go:150] Version: v1.35.3
I0402 22:16:03.402644 1 server.go:152] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
I0402 22:16:03.483692 1 shared_informer.go:349] "Waiting for caches to sync" controller="node_authorizer"
I0402 22:16:03.484939 1 shared_informer.go:370] "Waiting for caches to sync"
I0402 22:16:03.485946 1 plugins.go:157] Loaded 14 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,PodTopologyLabels,MutatingAdmissionPolicy,MutatingAdmissionWebhook.
I0402 22:16:03.485956 1 plugins.go:160] Loaded 14 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,PodSecurity,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,ClusterTrustBundleAttest,CertificateSubjectRestriction,NodeDeclaredFeatureValidator,ValidatingAdmissionPolicy,ValidatingAdmissionWebhook,ResourceQuota.
I0402 22:16:03.486040 1 instance.go:240] Using reconciler: lease
I0402 22:16:03.678751 1 handler.go:304] Adding GroupVersion apiextensions.k8s.io v1 to ResourceManager
W0402 22:16:03.678769 1 genericapiserver.go:787] Skipping API apiextensions.k8s.io/v1beta1 because it has no resources.
I0402 22:16:03.681206 1 cidrallocator.go:198] starting ServiceCIDR Allocator Controller
I0402 22:16:03.746820 1 handler.go:304] Adding GroupVersion v1 to ResourceManager
I0402 22:16:03.746952 1 apis.go:112] API group "internal.apiserver.k8s.io" is not enabled, skipping.
W0402 22:16:03.747569 1 logging.go:55] [core] [Channel #95 SubChannel #96]grpc: addrConn.createTransport failed to connect to {Addr: "test0:2379", ServerName: "test0:2379", BalancerAttributes: {"<%!p(pickfirstleaf.managedByPickfirstKeyType={})>": "<%!p(bool=true)>" }}. Err: connection error: desc = "transport: Error while dialing: dial tcp: lookup test0: operation was canceled"
[ Trimmed repeating message ]
W0402 22:16:03.818205 1 logging.go:55] [core] [Channel #187 SubChannel #188]grpc: addrConn.createTransport failed to connect to {Addr: "test0:2379", ServerName: "test0:2379", BalancerAttributes: {"<%!p(pickfirstleaf.managedByPickfirstKeyType={})>": "<%!p(bool=true)>" }}. Err: connection error: desc = "transport: Error while dialing: dial tcp: lookup test0: operation was canceled"
[ Trimmed repeating message ]
I0402 22:16:03.820568 1 apis.go:112] API group "storagemigration.k8s.io" is not enabled, skipping.
W0402 22:16:03.866126 1 logging.go:55] [core] [Channel #251 SubChannel #252]grpc: addrConn.createTransport failed to connect to {Addr: "test0:2379", ServerName: "test0:2379", BalancerAttributes: {"<%!p(pickfirstleaf.managedByPickfirstKeyType={})>": "<%!p(bool=true)>" }}. Err: connection error: desc = "transport: authentication handshake failed: context canceled"
I0402 22:16:03.876371 1 handler.go:304] Adding GroupVersion authentication.k8s.io v1 to ResourceManager
W0402 22:16:03.876385 1 genericapiserver.go:787] Skipping API authentication.k8s.io/v1beta1 because it has no resources.
W0402 22:16:03.876387 1 genericapiserver.go:787] Skipping API authentication.k8s.io/v1alpha1 because it has no resources.
I0402 22:16:03.876587 1 handler.go:304] Adding GroupVersion authorization.k8s.io v1 to ResourceManager
W0402 22:16:03.876593 1 genericapiserver.go:787] Skipping API authorization.k8s.io/v1beta1 because it has no resources.
I0402 22:16:03.876990 1 handler.go:304] Adding GroupVersion autoscaling v2 to ResourceManager
I0402 22:16:03.877327 1 handler.go:304] Adding GroupVersion autoscaling v1 to ResourceManager
W0402 22:16:03.877334 1 genericapiserver.go:787] Skipping API autoscaling/v2beta1 because it has no resources.
W0402 22:16:03.877336 1 genericapiserver.go:787] Skipping API autoscaling/v2beta2 because it has no resources.
I0402 22:16:03.877951 1 handler.go:304] Adding GroupVersion batch v1 to ResourceManager
W0402 22:16:03.877958 1 genericapiserver.go:787] Skipping API batch/v1beta1 because it has no resources.
I0402 22:16:03.878340 1 handler.go:304] Adding GroupVersion certificates.k8s.io v1 to ResourceManager
W0402 22:16:03.878345 1 genericapiserver.go:787] Skipping API certificates.k8s.io/v1beta1 because it has no resources.
W0402 22:16:03.878348 1 genericapiserver.go:787] Skipping API certificates.k8s.io/v1alpha1 because it has no resources.
I0402 22:16:03.878573 1 handler.go:304] Adding GroupVersion coordination.k8s.io v1 to ResourceManager
W0402 22:16:03.878577 1 genericapiserver.go:787] Skipping API coordination.k8s.io/v1beta1 because it has no resources.
W0402 22:16:03.878579 1 genericapiserver.go:787] Skipping API coordination.k8s.io/v1alpha2 because it has no resources.
I0402 22:16:03.878813 1 handler.go:304] Adding GroupVersion discovery.k8s.io v1 to ResourceManager
W0402 22:16:03.878818 1 genericapiserver.go:787] Skipping API discovery.k8s.io/v1beta1 because it has no resources.
I0402 22:16:03.879859 1 handler.go:304] Adding GroupVersion networking.k8s.io v1 to ResourceManager
W0402 22:16:03.879872 1 genericapiserver.go:787] Skipping API networking.k8s.io/v1beta1 because it has no resources.
I0402 22:16:03.880030 1 handler.go:304] Adding GroupVersion node.k8s.io v1 to ResourceManager
W0402 22:16:03.880045 1 genericapiserver.go:787] Skipping API node.k8s.io/v1beta1 because it has no resources.
W0402 22:16:03.880052 1 genericapiserver.go:787] Skipping API node.k8s.io/v1alpha1 because it has no resources.
I0402 22:16:03.880251 1 handler.go:304] Adding GroupVersion policy v1 to ResourceManager
W0402 22:16:03.880262 1 genericapiserver.go:787] Skipping API policy/v1beta1 because it has no resources.
I0402 22:16:03.880658 1 handler.go:304] Adding GroupVersion rbac.authorization.k8s.io v1 to ResourceManager
W0402 22:16:03.880669 1 genericapiserver.go:787] Skipping API rbac.authorization.k8s.io/v1beta1 because it has no resources.
W0402 22:16:03.880675 1 genericapiserver.go:787] Skipping API rbac.authorization.k8s.io/v1alpha1 because it has no resources.
I0402 22:16:03.880809 1 handler.go:304] Adding GroupVersion scheduling.k8s.io v1 to ResourceManager
W0402 22:16:03.880820 1 genericapiserver.go:787] Skipping API scheduling.k8s.io/v1beta1 because it has no resources.
W0402 22:16:03.880826 1 genericapiserver.go:787] Skipping API scheduling.k8s.io/v1alpha1 because it has no resources.
I0402 22:16:03.881460 1 handler.go:304] Adding GroupVersion storage.k8s.io v1 to ResourceManager
W0402 22:16:03.881471 1 genericapiserver.go:787] Skipping API storage.k8s.io/v1beta1 because it has no resources.
W0402 22:16:03.881477 1 genericapiserver.go:787] Skipping API storage.k8s.io/v1alpha1 because it has no resources.
I0402 22:16:03.881785 1 handler.go:304] Adding GroupVersion flowcontrol.apiserver.k8s.io v1 to ResourceManager
W0402 22:16:03.881795 1 genericapiserver.go:787] Skipping API flowcontrol.apiserver.k8s.io/v1beta3 because it has no resources.
W0402 22:16:03.881803 1 genericapiserver.go:787] Skipping API flowcontrol.apiserver.k8s.io/v1beta2 because it has no resources.
W0402 22:16:03.881813 1 genericapiserver.go:787] Skipping API flowcontrol.apiserver.k8s.io/v1beta1 because it has no resources.
I0402 22:16:03.882802 1 handler.go:304] Adding GroupVersion apps v1 to ResourceManager
W0402 22:16:03.882813 1 genericapiserver.go:787] Skipping API apps/v1beta2 because it has no resources.
W0402 22:16:03.882819 1 genericapiserver.go:787] Skipping API apps/v1beta1 because it has no resources.
I0402 22:16:03.883372 1 handler.go:304] Adding GroupVersion admissionregistration.k8s.io v1 to ResourceManager
W0402 22:16:03.883387 1 genericapiserver.go:787] Skipping API admissionregistration.k8s.io/v1beta1 because it has no resources.
W0402 22:16:03.883392 1 genericapiserver.go:787] Skipping API admissionregistration.k8s.io/v1alpha1 because it has no resources.
I0402 22:16:03.883552 1 handler.go:304] Adding GroupVersion events.k8s.io v1 to ResourceManager
W0402 22:16:03.883560 1 genericapiserver.go:787] Skipping API events.k8s.io/v1beta1 because it has no resources.
I0402 22:16:03.884176 1 handler.go:304] Adding GroupVersion resource.k8s.io v1 to ResourceManager
W0402 22:16:03.884190 1 genericapiserver.go:787] Skipping API resource.k8s.io/v1beta2 because it has no resources.
W0402 22:16:03.884200 1 genericapiserver.go:787] Skipping API resource.k8s.io/v1beta1 because it has no resources.
W0402 22:16:03.884205 1 genericapiserver.go:787] Skipping API resource.k8s.io/v1alpha3 because it has no resources.
W0402 22:16:03.885162 1 logging.go:55] [core] [Channel #255 SubChannel #256]grpc: addrConn.createTransport failed to connect to {Addr: "test0:2379", ServerName: "test0:2379", BalancerAttributes: {"<%!p(pickfirstleaf.managedByPickfirstKeyType={})>": "<%!p(bool=true)>" }}. Err: connection error: desc = "transport: Error while dialing: dial tcp: lookup test0: operation was canceled"
I0402 22:16:03.888172 1 handler.go:304] Adding GroupVersion apiregistration.k8s.io v1 to ResourceManager
W0402 22:16:03.888349 1 genericapiserver.go:787] Skipping API apiregistration.k8s.io/v1beta1 because it has no resources.
I0402 22:16:04.046008 1 secure_serving.go:211] Serving securely on [::]:6443
I0402 22:16:04.046056 1 dynamic_cafile_content.go:161] "Starting controller" name="client-ca-bundle::/run/kubernetes/pki/api_client_ca.crt"
I0402 22:16:04.046130 1 customresource_discovery_controller.go:294] Starting DiscoveryController
I0402 22:16:04.046141 1 dynamic_serving_content.go:135] "Starting controller" name="serving-cert::/run/kubernetes/pki/api_server.crt::/run/kubernetes/pki/api_server.key"
I0402 22:16:04.046145 1 local_available_controller.go:156] Starting LocalAvailability controller
I0402 22:16:04.046148 1 cache.go:32] Waiting for caches to sync for LocalAvailability controller
I0402 22:16:04.046186 1 tlsconfig.go:243] "Starting DynamicServingCertificateController"
I0402 22:16:04.046262 1 cluster_authentication_trust_controller.go:459] Starting cluster_authentication_trust_controller controller
I0402 22:16:04.046271 1 shared_informer.go:370] "Waiting for caches to sync"
I0402 22:16:04.046452 1 apf_controller.go:377] Starting API Priority and Fairness config controller
I0402 22:16:04.046475 1 remote_available_controller.go:425] Starting RemoteAvailability controller
I0402 22:16:04.046480 1 cache.go:32] Waiting for caches to sync for RemoteAvailability controller
I0402 22:16:04.046529 1 repairip.go:210] Starting ipallocator-repair-controller
I0402 22:16:04.046536 1 shared_informer.go:349] "Waiting for caches to sync" controller="ipallocator-repair-controller"
I0402 22:16:04.046611 1 dynamic_cafile_content.go:161] "Starting controller" name="client-ca-bundle::/run/kubernetes/pki/api_client_ca.crt"
I0402 22:16:04.046957 1 default_servicecidr_controller.go:110] Starting kubernetes-service-cidr-controller
I0402 22:16:04.046965 1 shared_informer.go:349] "Waiting for caches to sync" controller="kubernetes-service-cidr-controller"
I0402 22:16:04.047355 1 apiservice_controller.go:100] Starting APIServiceRegistrationController
I0402 22:16:04.047369 1 cache.go:32] Waiting for caches to sync for APIServiceRegistrationController controller
I0402 22:16:04.047385 1 controller.go:78] Starting OpenAPI AggregationController
I0402 22:16:04.047408 1 controller.go:113] "Deleting old lease on startup" lease="kube-system/apiserver-robydumo4h6qnj4l4vy7hymjdi"
I0402 22:16:04.047494 1 gc_controller.go:85] "Starting apiserver lease garbage collector"
I0402 22:16:04.047505 1 shared_informer.go:370] "Waiting for caches to sync"
I0402 22:16:04.047521 1 system_namespaces_controller.go:66] Starting system namespaces controller
I0402 22:16:04.047545 1 aggregator.go:185] waiting for initial CRD sync...
I0402 22:16:04.047558 1 controller.go:80] Starting OpenAPI V3 AggregationController
I0402 22:16:04.047593 1 controller.go:127] "Starting legacy_token_tracking_controller"
I0402 22:16:04.047604 1 shared_informer.go:370] "Waiting for caches to sync"
I0402 22:16:04.047625 1 controller.go:142] Starting OpenAPI controller
I0402 22:16:04.047642 1 controller.go:90] Starting OpenAPI V3 controller
I0402 22:16:04.047651 1 naming_controller.go:305] Starting NamingConditionController
I0402 22:16:04.047667 1 nonstructuralschema_controller.go:202] Starting NonStructuralSchemaConditionController
I0402 22:16:04.047681 1 apiapproval_controller.go:196] Starting KubernetesAPIApprovalPolicyConformantConditionController
I0402 22:16:04.047689 1 crd_finalizer.go:273] Starting CRDFinalizer
I0402 22:16:04.053544 1 crdregistration_controller.go:114] Starting crd-autoregister controller
I0402 22:16:04.053557 1 shared_informer.go:349] "Waiting for caches to sync" controller="crd-autoregister"
I0402 22:16:04.083896 1 shared_informer.go:356] "Caches are synced" controller="node_authorizer"
I0402 22:16:04.085005 1 shared_informer.go:377] "Caches are synced"
I0402 22:16:04.085038 1 policy_source.go:248] refreshing policies
E0402 22:16:04.133767 1 controller.go:201] "Failed to ensure lease exists, will retry" err="namespaces \"kube-system\" not found" interval="200ms"
I0402 22:16:04.146560 1 shared_informer.go:377] "Caches are synced"
I0402 22:16:04.146572 1 shared_informer.go:356] "Caches are synced" controller="ipallocator-repair-controller"
I0402 22:16:04.146572 1 cache.go:39] Caches are synced for LocalAvailability controller
I0402 22:16:04.146585 1 apf_controller.go:382] Running API Priority and Fairness config worker
I0402 22:16:04.146587 1 apf_controller.go:385] Running API Priority and Fairness periodic rebalancing process
I0402 22:16:04.146614 1 cache.go:39] Caches are synced for RemoteAvailability controller
I0402 22:16:04.146980 1 shared_informer.go:356] "Caches are synced" controller="kubernetes-service-cidr-controller"
I0402 22:16:04.146989 1 default_servicecidr_controller.go:169] Creating default ServiceCIDR with CIDRs: [172.16.0.0/16]
I0402 22:16:04.147394 1 cache.go:39] Caches are synced for APIServiceRegistrationController controller
I0402 22:16:04.147420 1 handler_discovery.go:451] Starting ResourceDiscoveryManager
I0402 22:16:04.147675 1 shared_informer.go:377] "Caches are synced"
I0402 22:16:04.147702 1 controller.go:667] quota admission added evaluator for: namespaces
I0402 22:16:04.147760 1 shared_informer.go:377] "Caches are synced"
I0402 22:16:04.150194 1 cidrallocator.go:302] created ClusterIP allocator for Service CIDR 172.16.0.0/16
I0402 22:16:04.150382 1 default_servicecidr_controller.go:231] Setting default ServiceCIDR condition Ready to True
I0402 22:16:04.153613 1 shared_informer.go:356] "Caches are synced" controller="crd-autoregister"
I0402 22:16:04.153742 1 aggregator.go:187] initial CRD sync complete...
I0402 22:16:04.153746 1 autoregister_controller.go:144] Starting autoregister controller
I0402 22:16:04.153752 1 cache.go:32] Waiting for caches to sync for autoregister controller
I0402 22:16:04.153754 1 cache.go:39] Caches are synced for autoregister controller
I0402 22:16:04.153887 1 cidrallocator.go:278] updated ClusterIP allocator for Service CIDR 172.16.0.0/16
I0402 22:16:04.336439 1 controller.go:667] quota admission added evaluator for: leases.coordination.k8s.io
I0402 22:16:05.050371 1 storage_scheduling.go:123] created PriorityClass system-node-critical with value 2000001000
I0402 22:16:05.052062 1 storage_scheduling.go:123] created PriorityClass system-cluster-critical with value 2000000000
I0402 22:16:05.052068 1 storage_scheduling.go:139] all system priority classes are created successfully or already exist.
I0402 22:16:05.194145 1 controller.go:667] quota admission added evaluator for: roles.rbac.authorization.k8s.io
I0402 22:16:05.204018 1 controller.go:667] quota admission added evaluator for: rolebindings.rbac.authorization.k8s.io
I0402 22:16:05.249877 1 alloc.go:329] "allocated clusterIPs" service="default/kubernetes" clusterIPs={"IPv4":"172.16.0.1"}
W0402 22:16:05.252167 1 lease.go:265] Resetting endpoints for master service "kubernetes" to [192.168.18.3]
I0402 22:16:05.252440 1 controller.go:667] quota admission added evaluator for: endpoints
I0402 22:16:05.254043 1 controller.go:667] quota admission added evaluator for: endpointslices.discovery.k8s.io
I0402 22:16:06.115580 1 controller.go:667] quota admission added evaluator for: serviceaccounts
I0402 22:16:06.625118 1 controller.go:667] quota admission added evaluator for: daemonsets.apps
I0402 22:16:06.629115 1 controller.go:667] quota admission added evaluator for: deployments.apps
I0402 22:16:11.265957 1 controller.go:667] quota admission added evaluator for: replicasets.apps
I0402 22:16:11.316006 1 cidrallocator.go:278] updated ClusterIP allocator for Service CIDR 172.16.0.0/16
I0402 22:16:11.317782 1 cidrallocator.go:278] updated ClusterIP allocator for Service CIDR 172.16.0.0/16
I0402 22:16:11.564501 1 controller.go:667] quota admission added evaluator for: controllerrevisions.apps
After a little while the kubelet registers the node just fine and schedules pods, but the Cilium pods then go into a crash loop, reporting a missing service account token:
time=2026-04-02T22:22:03.757372168Z level=info msg="Cilium Operator" subsys=cilium-operator-generic version="1.19.2 3977f6a1 2026-03-17T13:15:59+00:00 go version go1.25.8 linux/arm64"
time=2026-04-02T22:22:03.758175377Z level=info msg="Certloader TLS watcher disabled" module=operator.operator-infra.operator-metrics config=certloader-server-tls
time=2026-04-02T22:22:03.758613585Z level=error msg="Invoke failed" error="unable to create k8s client rest configuration: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory" function="shell.registerShell (.../hive/shell/server.go:40)"
time=2026-04-02T22:22:03.758641918Z level=info msg="Stopping hive"
time=2026-04-02T22:22:03.758651418Z level=info msg="Stopped hive" duration=2.167µs
time=2026-04-02T22:22:03.75866396Z level=fatal msg="failed to start: failed to populate object graph: unable to create k8s client rest configuration: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory"
The pod spec, however, clearly requests that the token be mounted:
apiVersion: v1
kind: Pod
...
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              io.cilium/app: operator
          topologyKey: kubernetes.io/hostname
  automountServiceAccountToken: true
  containers:
    ...
      volumeMounts:
        - mountPath: /tmp/cilium/config-map
          name: cilium-config-path
          readOnly: true
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-t7n6g
          readOnly: true
    ...
  serviceAccount: cilium-operator
  serviceAccountName: cilium-operator
  volumes:
    - name: kube-api-access-t7n6g
      projected:
        defaultMode: 420
        sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              items:
                - key: ca.crt
                  path: ca.crt
              name: kube-root-ca.crt
          - downwardAPI:
              items:
                - fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
                  path: namespace
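For reference, the first things I checked were the pod's events and status (a sketch; assumes the operator runs in kube-system and carries the io.cilium/app: operator label from the spec above):

```shell
# Events and conditions for the operator pod; a failed projected-volume setup
# usually surfaces here as a FailedMount event.
kubectl -n kube-system describe pod -l io.cilium/app=operator

# Recent events across the namespace, newest last.
kubectl -n kube-system get events --sort-by=.lastTimestamp
```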
To figure out why that isn't happening, I tried a simple spec that should use the default service account in the default namespace, and hit the exact same issue:
---
apiVersion: v1
kind: Pod
metadata:
  name: debug
spec:
  hostNetwork: true
  containers:
    - name: alpine
      image: docker.io/library/alpine:3.23.3
      command:
        - /bin/sh
      args:
        - '-c'
        - 'sleep 3600'
      volumeMounts:
        - name: service-account
          mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          readOnly: true
  volumes:
    - name: service-account
      projected:
        sources:
          - serviceAccountToken:
              path: token
              expirationSeconds: 3600
When running the following, I see nothing:
kubectl exec debug -- ls /var/run/secrets/kubernetes.io/serviceaccount
So it's definitely not just Cilium. Still, there are no relevant errors or warnings from the container runtime, the API server, or the kubelet.
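For what it's worth, this is how I have been looking for kubelet-side evidence (a sketch; assumes a systemd-managed kubelet and the default /var/lib/kubelet root):

```shell
# Watch the kubelet for TokenRequest / projected-volume activity while the pod starts.
journalctl -u kubelet.service -f | grep -Ei 'token|projected'

# Check whether the kubelet materialized the projected volume on disk at all.
POD_UID=$(kubectl get pod debug -o jsonpath='{.metadata.uid}')
ls -la "/var/lib/kubelet/pods/${POD_UID}/volumes/kubernetes.io~projected/"
```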
I figured I should check whether I can get a token myself, and it seems to work fine:
% kubectl create token default | jwt decode -
Token header
------------
{
"alg": "RS256",
"kid": "u_uB8MGg8_YuQidjYC1dm4Fsk22AKLY2627ZT1VQaZs"
}
Token claims
------------
{
"aud": [
"https://test0:6443"
],
"exp": 1775172944,
"iat": 1775169344,
"iss": "https://test0:6443",
"jti": "9836324e-5b6b-4bb3-ac36-472f871e482d",
"kubernetes.io": {
"namespace": "default",
"serviceaccount": {
"name": "default",
"uid": "1f8b429d-2eed-42c5-874b-c256a249395a"
}
},
"nbf": 1775169344,
"sub": "system:serviceaccount:default:default"
}
This looks exactly as I expect. I was worried about the audience and issuer not being right, but this is the correct hostname of the node, and the port points to the API server.
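As an aside, jwt above is the jwt-cli tool; since this is an offline VM it isn't available everywhere, so the claims can also be dumped with nothing but base64 (a sketch, assuming GNU coreutils):

```shell
# Print the claims (second dot-separated segment) of a JWT.
# JWTs use unpadded base64url, so restore the standard alphabet and padding first.
jwt_claims() {
    seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
    while [ $(( ${#seg} % 4 )) -ne 0 ]; do seg="${seg}="; done
    printf '%s\n' "$seg" | base64 -d
    echo
}

# e.g. jwt_claims "$(kubectl create token default)"
```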
Then I thought maybe the NodeRestriction admission plugin or the Node authorizer was preventing the kubelet from requesting a token on behalf of the pod, but it works just fine when I impersonate the node:
% kubectl create token default --as=system:node:test0 --as-group=system:nodes --bound-object-kind=Pod --bound-object-name=debug --bound-object-uid=fbbb2fa2-cdd3-4b57-8bb9-8d987e02c026 | jwt decode -
Token header
------------
{
"alg": "RS256",
"kid": "u_uB8MGg8_YuQidjYC1dm4Fsk22AKLY2627ZT1VQaZs"
}
Token claims
------------
{
"aud": [
"https://test0:6443"
],
"exp": 1775173540,
"iat": 1775169940,
"iss": "https://test0:6443",
"jti": "81329c1e-fbe3-41f0-a4be-976634f300da",
"kubernetes.io": {
"namespace": "default",
"node": {
"name": "test0",
"uid": "f67f852f-988b-4935-8a42-38afc0418105"
},
"pod": {
"name": "debug",
"uid": "fbbb2fa2-cdd3-4b57-8bb9-8d987e02c026"
},
"serviceaccount": {
"name": "default",
"uid": "1f8b429d-2eed-42c5-874b-c256a249395a"
}
},
"nbf": 1775169940,
"sub": "system:serviceaccount:default:default"
}
So the private and public keys are clearly valid, since I can mint tokens. Authentication and authorization seem okay. The pod spec is valid. The only thing left is my API server configuration, but I would imagine I'd see something in the logs if I were doing it completely wrong?
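For cross-checking, the kid above can also be compared against what the API server actually publishes on its service account issuer discovery endpoints (a sketch; assumes issuer discovery is enabled, which it is by default, and a kubeconfig permitted to read these paths):

```shell
# The issuer the API server advertises; should match --service-account-issuer.
kubectl get --raw /.well-known/openid-configuration

# The public keys tokens are verified against; each "kid" here should match
# the "kid" in the decoded token headers.
kubectl get --raw /openid/v1/jwks
```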
There is a haproxy static pod that the kubelet communicates through to reach the (future) other nodes in the control plane. I tried skipping it entirely and going straight to the API server, but nothing changed. Changing the issuer/audience to the load balancer address & port also had no effect.
This last point relates to the use of test0_svc_account.pub in the kube-apiserver pod spec. I want each control plane node to have its own public/private key pair for service account signing, which means the API server has to trust all of the public keys via --service-account-key-file (the flag can be repeated) and accept every node's issuer as an audience via --api-audiences. I also tried doing away with this entirely and using only one key pair (omitting --api-audiences), but nothing changed.
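Spelled out, the multi-key setup I'm describing would look roughly like this in the API server args (a sketch; test1/test2 and their file names stand in for the hypothetical future nodes):

```yaml
args:
  # Each node signs with its own key and advertises itself as issuer...
  - "--service-account-signing-key-file=/run/kubernetes/pki/svc_account.key"
  - "--service-account-issuer=https://test0:6443"
  # ...but trusts every node's public key (the flag may be repeated)...
  - "--service-account-key-file=/run/kubernetes/pki/test0_svc_account.pub"
  - "--service-account-key-file=/run/kubernetes/pki/test1_svc_account.pub"
  - "--service-account-key-file=/run/kubernetes/pki/test2_svc_account.pub"
  # ...and accepts every node's issuer as a valid audience.
  - "--api-audiences=https://test0:6443,https://test1:6443,https://test2:6443"
```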
And finally, this is an offline test environment in a VM, so there is no internet access. The kubelet and static pods are instructed to use the systemd-resolved local stub resolver on 127.0.0.53, which resolves the machine's hostname to its "public" address (the one reachable from the host); any packets to that address are then hairpinned back to 127.0.0.1 via nftables. I tried binding and pointing all control plane components and the kubelet to loopback only, but nothing changed.
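To double-check the resolver/hairpin path from a pod's point of view, the debug pod above can be reused (a sketch; note it is hostNetwork, so it exercises the same stub resolver the kubelet uses, and busybox wget's TLS support depends on the build):

```shell
# Does the stub resolver resolve the node name inside the pod?
kubectl exec debug -- nslookup test0

# Does traffic to the resolved address actually reach the API server port?
# (A TLS/certificate error here would still prove resolution and routing work.)
kubectl exec debug -- wget -q -O- --no-check-certificate https://test0:6443/version
```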
I'm looking for anything suspicious you can spot in here, or general debugging tips for service account tokens and kubelet projected volumes. Thanks!