Why do multiple pods fail to start or enter CrashLoopBackOff due to DNS errors in ITRS Analytics?
Issue
The ITRS Analytics Web Console may become inaccessible. In the ITRS Analytics debug shell, you may see multiple pods that failed to start or are in CrashLoopBackOff. Pod logs often show errors resolving DNS hostnames.
Check the platformd pod first:
kubectl get pods -n kotsadm | grep platformd
kubectl logs platformd-<pod-id> -n kotsadm
Example DNS error:
java.lang.IllegalArgumentException: Environment variable STATSD_SERVER value 'statsd-internal.kotsadm.svc.cluster.local' is not a valid address or hostname
This indicates that the pod cannot resolve internal Kubernetes service names. Common causes include CoreDNS failures, security software on the node, SELinux, or incorrect kernel parameters.
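To confirm that name resolution fails cluster-wide and not only in one application, you can run a lookup from a short-lived pod. The following is a minimal sketch, not an ITRS-provided procedure. The function name dns_lookup_from_pod, the pod name dns-check, and the busybox:1.36 image are assumptions, and the image must be pullable from your nodes, which may not be the case in an air-gapped installation.

```shell
#!/bin/sh
# Hypothetical helper: resolve a hostname from a temporary pod.
# Assumes the busybox:1.36 image can be pulled by the cluster.
dns_lookup_from_pod() {
    namespace="$1"   # namespace to start the temporary pod in
    hostname="$2"    # service name to resolve

    # --rm deletes the pod when the command exits; -i attaches so the
    # nslookup output is printed to the terminal.
    kubectl run dns-check --rm -i --restart=Never \
        --image=busybox:1.36 -n "$namespace" -- nslookup "$hostname"
}

# Usage (run from the debug shell):
#   dns_lookup_from_pod kotsadm statsd-internal.kotsadm.svc.cluster.local
```

If the lookup fails here as well, the problem is likely with cluster DNS or node networking and not with an individual pod.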
Resolution
CoreDNS failure
The CoreDNS pod in Kubernetes may have failed or may not be responding correctly. Check the CoreDNS pod status and logs:
kubectl get pods -n kube-system | grep coredns
kubectl logs coredns-<pod-id> -n kube-system
If CoreDNS fails to start or the logs show timeouts or errors, restart the deployment:
kubectl rollout restart deployment/coredns -n kube-system
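After a restart, it can help to wait for the rollout to finish before rechecking the failing pods. A small sketch, assuming the deployment is named coredns as in the command above. The function name wait_for_coredns and the default timeout of 120s are illustrative choices.

```shell
#!/bin/sh
# Hypothetical helper: block until the CoreDNS rollout completes or times out.
wait_for_coredns() {
    timeout="${1:-120s}"   # assumed default; pass another value to override

    # Exits non-zero if the rollout does not complete within the timeout.
    kubectl rollout status deployment/coredns -n kube-system --timeout="$timeout"
}

# Usage:
#   wait_for_coredns        # wait up to 120s
#   wait_for_coredns 5m     # wait up to 5 minutes
```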
Antivirus or security tools
Antivirus or endpoint security tools on the node may interfere with Kubernetes networking.
Check whether security software is installed or running on the node. If the problem persists, contact ITRS Support and attach a support bundle.
SELinux
SELinux may be enabled in enforcing mode on the node.
Verify the SELinux status:
sestatus
getenforce
cat /etc/selinux/config
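The runtime mode reported by getenforce can differ from the mode configured in /etc/selinux/config, which is applied at the next boot. The sketch below extracts the configured mode so the two can be compared. The function name selinux_config_mode is hypothetical, and the parsing assumes the standard SELINUX=<mode> line format.

```shell
#!/bin/sh
# Hypothetical helper: print the SELinux mode configured for the next boot.
selinux_config_mode() {
    config_file="${1:-/etc/selinux/config}"

    # Match only uncommented SELINUX= lines (SELINUXTYPE= does not match)
    # and print the value from the last one.
    sed -n 's/^[[:space:]]*SELINUX=[[:space:]]*\([A-Za-z]*\).*/\1/p' \
        "$config_file" | tail -n 1
}

# Usage (run on the node):
#   echo "runtime: $(getenforce)"
#   echo "on boot: $(selinux_config_mode)"
```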
Work with your IT or security team if SELinux settings need to be changed.
Kernel parameters
Unexpected changes to Linux kernel parameters can affect Kubernetes networking.
Verify that net.ipv4.conf.default.rp_filter and net.ipv4.conf.all.rp_filter are set to 2. These parameters affect routing of packets between network interfaces created by Kubernetes.
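One way to check both values at once is a short script around sysctl -n, which prints the current value of a kernel parameter. This is a sketch only. The function name check_rp_filter and the OK/FAIL output format are illustrative and not part of the product.

```shell
#!/bin/sh
# Hypothetical helper: verify that both rp_filter parameters are set to 2.
check_rp_filter() {
    expected=2
    status=0

    for key in net.ipv4.conf.default.rp_filter net.ipv4.conf.all.rp_filter; do
        # Read the live kernel value; empty if the key cannot be read.
        actual="$(sysctl -n "$key" 2>/dev/null)"
        if [ "$actual" = "$expected" ]; then
            echo "OK   $key = $actual"
        else
            echo "FAIL $key = $actual (expected $expected)"
            status=1
        fi
    done

    # Non-zero return means at least one parameter differs from 2.
    return "$status"
}

# Usage (run on the node):
#   check_rp_filter
```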
Embedded Cluster sets rp_filter in /etc/sysctl.d/99-embedded-cluster.conf. The values may have been overridden in another file under /etc/sysctl.d/ or in /etc/sysctl.conf. Search for overrides with:
grep -R rp_filter /etc/sysctl*