Hello @Csaba_Zagoni, @Dave, @Jon,
I’ve been trying to think of a better way to catch the moment that the “thread is dead” problem occurs while also providing automatic recovery.
I think that, rather than continuing to print the “thread is dead” error message, we can force emonhub to close down and have a watchdog script restart it, saving the last 100 lines of the emonhub log to a dedicated crash log.
For those happy to make modifications via the terminal, I have outlined the steps below. @Dave, it would be great if you could try this. @Csaba_Zagoni, I have sent you a PM; I would be happy to help you with this if you can give me remote access. If this provides a solution, and better logging of “thread is dead” events from which we can debug the root cause, then we can push it out as a general update.
Implementation steps:
Open emonhub.py for editing:
sudo nano /home/pi/emonhub/src/emonhub.py
Navigate to line 140 and add “self._exit = True” directly below the line that logs the “thread is dead” error, like so:
    if not I.isAlive():
        self._log.warning(I.name + " thread is dead") # had to be restarted")
        self._exit = True
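Line numbers can shift between emonhub versions, so searching for the log message may be a quicker way to find the right spot. The following is a minimal sketch, assuming the file path used above; the function name find_dead_thread_line is just an illustrative helper and is not part of emonhub.

```shell
#!/bin/bash
# Print "line_number:text" for every line containing the log message.
# The default path is the one used in this post; pass another path as the
# first argument if your install differs.
find_dead_thread_line() {
    local file="${1:-/home/pi/emonhub/src/emonhub.py}"
    grep -n "thread is dead" "$file"
}

# Run against the default path only when the file is actually present.
if [ -f /home/pi/emonhub/src/emonhub.py ]; then
    find_dead_thread_line
fi
```

The number printed before the colon is the line to jump to in nano (Ctrl+_ followed by the line number).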
If you can't find this point, this link might help:
Restart emonhub at this point:
sudo service emonhub restart
We can then add a watchdog that checks whether emonhub is running and restarts it if it is not.
To create this basic watchdog, create a file called watchdog.sh in /home/pi:
rpi-rw
cd
nano watchdog.sh
Paste the following content into that file:
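#!/bin/bash
# Count processes matching the emonhub command line. The grep itself also
# appears in the ps output, so a count below 2 means emonhub is not running.
TEST=$( ps aux | grep "python /usr/share/emonhub/emonhub.py --config-file /home/pi/data/emonhub.conf" | wc -l )
# Capture the log tail before restarting, so it reflects the crash.
LOG=$(tail -n 100 /var/log/emonhub/emonhub.log)
if [ "$TEST" -lt 2 ]; then
    echo "Emonhub is down, restarting!"
    sudo service emonhub restart
    echo "Last 100 lines of emonhub.log:"
    echo "$LOG"
fi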
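The threshold of 2 in the script is there because the grep command itself shows up in the ps output and gets counted. A possible variation, shown here only as a sketch, wraps the first letter of the pattern in brackets so that grep never matches its own command line and the count can be compared against 1 instead. The function name count_emonhub is an assumption for illustration, and it reads the process listing from standard input so it can be tried without touching the running system.

```shell
#!/bin/bash
# Count lines on stdin that look like a running emonhub process.
# The bracket expression [e] matches the letter "e", but grep's own command
# line contains the literal text "[e]monhub.py", which the pattern does not
# match, so the grep process is never counted.
count_emonhub() {
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    grep -c "[e]monhub.py --config-file" || true
}

# Typical use inside the watchdog (commented out so the sketch has no side effects):
# RUNNING=$(ps aux | count_emonhub)
# if [ "$RUNNING" -lt 1 ]; then sudo service emonhub restart; fi
```

The original two-process check works fine as it is; this form just makes the comparison a little easier to read.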
Save and exit
Make it executable with:
sudo chmod +x watchdog.sh
Finally, add it to crontab with:
sudo crontab -e
crontab entry:
* * * * * /home/pi/watchdog.sh >> /home/pi/data/watchdog.log 2>&1
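Once the cron job has been running for a while, the crash log can be checked to see how often the watchdog had to step in. This is a rough sketch that counts the restart message echoed by watchdog.sh; the function name count_restarts is hypothetical, and the default path is the log file from the crontab entry above.

```shell
#!/bin/bash
# Count how many times the watchdog has logged a restart.
# Pass a different log path as the first argument if yours differs.
count_restarts() {
    local log="${1:-/home/pi/data/watchdog.log}"
    # No log file yet means the watchdog has never had to restart emonhub.
    if [ ! -f "$log" ]; then
        echo 0
        return 0
    fi
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    grep -c "Emonhub is down, restarting!" "$log" || true
}

# Example: echo "Restarts so far: $(count_restarts)"
```

A non-zero count means there is at least one set of 100 log lines in watchdog.log worth posting back here for debugging.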