A while ago we had a strange outage: the Revolution Pi lost network connectivity while the control systems kept working properly. Only the HMI and data logging were inaccessible, and we could not VPN into the system.
After some digging (and a bit of luck) we traced it to the DHCP daemon (dhcpcd) no longer running: the Linux out-of-memory (OOM) killer had chosen to kill it. As it turned out, this was a freak combination of a DHCP lease expiring and memory pressure. We had been hammering the system, but I would not have expected Linux to kill a critical component instead of something more expendable.
For more information see https://www.baeldung.com/linux/memory-o ... oom-killer by Baeldung.
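To check whether the OOM killer was responsible on your own system, you can search the kernel log for its messages. The snippet below is only a rough sketch: `find_oom_kills` is a helper name made up for this post, and the search patterns assume the usual kernel wording ("Out of memory: ..." and "Killed process ..."), which may differ between kernel versions.

```shell
#!/bin/sh
# Hypothetical helper: filters kernel log lines read from stdin and keeps
# only those that look like OOM killer activity.
# Assumption: the kernel logs "Out of memory" and/or "Killed process".
find_oom_kills() {
    grep -Ei 'out of memory|killed process'
}
```

Typical use would be `journalctl -k | find_oom_kills` or `dmesg | find_oom_kills`. If dhcpcd was the victim, its name should appear in parentheses after the process ID.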
A bandaid for this problem is to instruct systemd to automatically restart dhcpcd if it is killed. We've followed https://ma.ttias.be/auto-restart-crashe ... e-systemd/ by Mattias Geniar.
Run the following command:
$ sudo systemctl edit --full dhcpcd.service
Then change the unit file so that it looks like this:
[Unit]
Description=dhcpcd on all interfaces
Wants=network.target
Before=network.target
# Rate limit: give up if the service is started more than 5 times within 300 seconds.
# (systemd does not allow comments at the end of a setting line.)
StartLimitIntervalSec=300
StartLimitBurst=5
[Service]
Type=forking
PIDFile=/run/dhcpcd.pid
ExecStart=/usr/lib/dhcpcd5/dhcpcd -q -b
ExecStop=/sbin/dhcpcd -x
# Restart only after an unclean exit (such as being killed), not after "systemctl stop".
Restart=on-failure
# Wait 60 seconds before each restart attempt.
RestartSec=60s
[Install]
WantedBy=multi-user.target
Alias=dhcpcd5.service
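To verify that the change took effect, you can ask systemd which restart policy it has loaded for the unit. This is a minimal sketch: `has_restart_policy` is a hypothetical helper name, and it only checks that `Restart` is set to something other than `no`.

```shell
#!/bin/sh
# Hypothetical helper: reads "Key=Value" lines (as printed by
# `systemctl show`) from stdin and succeeds only if a Restart= line
# is present with a value other than "no".
has_restart_policy() {
    awk -F= '$1 == "Restart" { found = ($2 != "no") } END { exit !found }'
}
```

For example: `systemctl show dhcpcd.service -p Restart | has_restart_policy && echo "restart policy active"`. To simulate the OOM kill, `sudo systemctl kill -s KILL dhcpcd.service` should work; after roughly the `RestartSec` delay, `systemctl status dhcpcd.service` should show the daemon running again. Be careful when doing this over the network, as connectivity may drop in the meantime.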