My EmonPi stopped updating the inputs as of about 5 hrs ago.
I am absolutely certain it was because /var/log/ was full (df -h showed it as 100% use). The last recorded values in the log file stopped at about the right time.
I tried deleting a rotated syslog, but that did not kick things into life. All the services seemed fine, so I had to reboot, after which things were of course fine.
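For anyone wanting to see what is actually occupying /var/log when df -h reports it as full, here is a minimal sketch. The function name largest_files is made up for illustration, and it only uses standard find, du, sort and head, so it should behave the same on a stock image, though I have not tried it on every one.

```shell
# largest_files DIR [N]
# Print the N largest regular files under DIR (size in KiB, largest first).
# "largest_files" is a hypothetical helper name, not part of any emonPi tooling.
largest_files() {
    local dir="$1" count="${2:-10}"
    # du -k reports disk usage per file in KiB; sort -rn puts the biggest first
    find "$dir" -type f -exec du -k {} + 2>/dev/null | sort -rn | head -n "$count"
}
```

Something like `df -h /var/log` followed by `largest_files /var/log 5` (run as root if some files are unreadable) should show whether one file is responsible.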
Been away (4hrs) and on coming back, a df -h shows
This is an issue I highlighted years ago but it was deemed “unlikely to happen so not worth fixing”.
When you say your inputs have stopped updating, is that all inputs? Or just those that come from a certain source or via a certain route? For the moment I am guessing you mean emonhub has stopped passing data to emoncms.
I have seen this problem more than once, but I can confidently say 2 things that may be of interest to you.
When this has happened previously, emonhub has not stopped processing data. I have found systems that have not been updating logfiles for several months, but there has been no interruption in data, and as soon as I delete some logs, emonhub has started updating emonhub.log again immediately.
On the one early install I had that used cron jobs to trigger an event twice a day via emonhub, I could occasionally see that the cron-triggered inputs alone had stopped updating. On every single occasion, when I looked into this I found that /var/log was full, no logs were being updated, emonhub was still running perfectly (although unable to write to emonhub.log), and the cron jobs were not running successfully because syslog could not be written to.
The combination of these 2 facts, plus your discovery that the emonhub logs are being duplicated to syslog (I assume the other place is journalctl), makes me wonder whether emonhub writing its logs to syslog is what has stopped it working. In my experience, the characteristics of not being able to write to syslog and not being able to write to emonhub.log are apparently different.
Is this line active in your emonhub.conf?
If so, try removing that. If not, I then find myself wondering if this is related to the recent changes to emonhub logging, namely the move from an init.d service to a systemd unit and the subsequent dropping of emonhub.log. Just a thought, nothing more than a suggestion for you to try.
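One way to check whether emonhub really is writing to syslog, and how much compared with everything else, is to tally syslog lines by program tag. This is only a sketch: syslog_top_talkers is a made-up helper name, it assumes the traditional "Mon DD HH:MM:SS host tag[pid]: message" line format, and the tag emonhub actually logs under may differ on your image.

```shell
# syslog_top_talkers FILE [N]
# Tally lines per program tag in a traditional syslog-format file, busiest first.
# Hypothetical helper; assumes field 5 is the "tag[pid]:" part of each line.
syslog_top_talkers() {
    local file="$1" count="${2:-10}"
    awk '{
             tag = $5
             sub(/\[[0-9]+\]:?$/, "", tag)   # strip a trailing "[pid]:" if present
             sub(/:$/, "", tag)              # strip a bare trailing ":" otherwise
             n[tag]++
         }
         END { for (t in n) print n[t], t }' "$file" | sort -rn | head -n "$count"
}
```

Running `syslog_top_talkers /var/log/syslog` and comparing with `journalctl -u emonhub --since "1 hour ago" | wc -l` (the unit name emonhub is an assumption) should show whether the same entries are landing in both places.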
I understood that, but it is (was?) not normal for emonhub to do that. I know without a shadow of a doubt that emonhub normally ploughs on regardless of a full /var/log, implying that something has altered to change that behaviour. What has changed? Oh yeah, emonhub is no longer init.d or logging to emonhub.log! I wonder if that’s pure coincidence or not?
That’s what I was implying. When something breaks or changes characteristics all of a sudden, the first place to look is at what has changed that might be related. Not many issues occur naturally.
If it does turn out to be linked, this is a prime example of why we shouldn’t change things that work perfectly, just for the hell of it. emonhub.logs have been by far the most reliable source of info for the last few years. I really didn’t want it to be changed.
If it happens again, before doing anything, copy the whole /var/log to disk so that you can trawl through the various files as we start looking into it. There are (at least) 3 issues to look at here:
Why did emonhub stop? Did it actually stop? Or was it actually mosquitto that stopped passing data, or emoncms feedwriter that stopped writing? Etc. etc.
Why did the log fill up? Prior to emonhub.log getting axed, the log entries were going there (emonhub.log), which is still part of /var/log, so the overall size of /var/log has not actually changed; the log entries are just going to a different file within that folder. So what was over-logging to fill it up?
Why is emonhub writing to syslog? IIUC the changes made to the emonhub service unit were to make emonhub write its logs to journalctl, not syslog, so what’s going on there?
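For the "copy the whole /var/log to disk" step, something along these lines would do as a rough sketch. snapshot_logs is a made-up name and the destination is whatever writable directory you choose outside /var/log. It relies on GNU tar's convention of exiting with 1 when a file changed while being read, which is likely with live logs, so only higher exit codes are treated as failure.

```shell
# snapshot_logs SRC_DIR DEST_DIR
# Archive SRC_DIR into a timestamped tarball inside DEST_DIR and print its path.
# Hypothetical helper; DEST_DIR must be on a partition that is not full.
snapshot_logs() {
    local src="$1" dest="$2" stamp archive rc
    stamp="$(date +%Y%m%d-%H%M%S)"
    archive="$dest/log-snapshot-$stamp.tar.gz"
    mkdir -p "$dest" || return 1
    # -C switches to the parent directory so the archive holds relative paths
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
    rc=$?
    # GNU tar: 1 means "file changed as we read it"; 2 or more is a real error
    [ "$rc" -le 1 ] || return "$rc"
    printf '%s\n' "$archive"
}
```

For example `sudo bash -c '. ./snapshot.sh; snapshot_logs /var/log /home/pi/log-snapshots'`, where both snapshot.sh and the destination path are placeholders. Do this before deleting anything or rebooting.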
I’m not sure. I guess we are going to struggle to debug this, given the log files were full! It would be good to know if it was emonhub or emoncms_mqtt, or something with mosquitto. I haven’t seen this with the latest image though, so I’m not sure.
The fact there is a data gap means emoncms really was not processing the input from emonhub, even though it appeared to be running (rolling log on emonhub admin page).
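To narrow down which link in the chain stopped next time, it may help to record the state of each service before rebooting. A minimal sketch follows; service_states is a made-up helper, and the unit names passed to it (emonhub, mosquitto, emoncms_mqtt, feedwriter) are taken from the names used in this thread, so they are assumptions that may not match the unit names on a given image.

```shell
# service_states UNIT...
# Print one "unit: state" line per systemd unit, using "systemctl is-active".
# Hypothetical helper; the unit names are whatever you pass in.
service_states() {
    local unit
    for unit in "$@"; do
        # is-active prints a state such as active, inactive or failed,
        # and exits non-zero when the unit is not active
        printf '%s: %s\n' "$unit" "$(systemctl is-active "$unit" 2>/dev/null)"
    done
}
```

For example `service_states emonhub mosquitto emoncms_mqtt feedwriter`. A service can report active and still not be passing data, so this only rules out the simple case of one of them having died.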
Looking at your ls -la /var/log output in the first post, where are your rotated logs? Is this a brand new image or is logrotate not working?
The reason Glyn set the rotation at hourly was to keep the size down, but it is still easy to fill the log partition inside an hour. You should check the apache folder for oversized files; apache is well known for runaway logs when there’s an issue.
And… where is ufw.log? And ssh.log? UFW can fill up quickly when there is a brute force attack in progress. This is why I moved my ssh ports to a non-standard port number. I had never had a breach, but it was annoying me that brute force attacks would fill up /var/log with errors. Since using a non-standard port and closing 22 in ufw, this doesn’t happen.
And why is the mysql.log file not in the mysql folder?
As far as I can see, there is no rotation - all the rotation conf files look stock to me. If you can point me to the conf file that does hourly, happy to check. In addition, I’ve been trying to work out how the syslogs are rotated but so far have come up blank (as there was one when it first went down).
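To find which conf file, if any, asks for hourly rotation, the frequency directives can be listed across all the logrotate configs at once. This is a sketch with a made-up function name, rotation_frequencies, and it assumes the usual /etc/logrotate.conf and /etc/logrotate.d locations.

```shell
# rotation_frequencies PATH...
# List every hourly/daily/weekly/monthly directive found in the given
# logrotate config files or directories, as file:line:text.
# Hypothetical helper name; relies on grep supporting -r (GNU grep does).
rotation_frequencies() {
    grep -rnE '^[[:space:]]*(hourly|daily|weekly|monthly)([[:space:]]|$)' "$@" 2>/dev/null
}
```

Try `rotation_frequencies /etc/logrotate.conf /etc/logrotate.d`. Bear in mind that an hourly directive only has an effect if logrotate itself is run hourly, so it is also worth checking `ls /etc/cron.hourly /etc/cron.daily` to see where logrotate is scheduled, and `sudo logrotate -d /etc/logrotate.conf` gives a dry run of what would be rotated without changing anything.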
ufw is inactive (remember, stock EmonPi deliberately).
pi@emonpi:~ $ sudo journalctl -u ssh
-- Logs begin at Sun 2019-03-17 17:25:15 UTC, end at Sun 2019-03-17 20:04:02 UTC. --
Mar 17 17:30:45 emonpi sshd[3141]: Accepted password for pi from 192.168.7.123 port 55065 ssh2
Mar 17 17:30:45 emonpi sshd[3141]: pam_unix(sshd:session): session opened for user pi by (uid=0)