Community
OpenEnergyMonitor

Community

EmonPi stopped updating inputs - /var/log/ full

My EmonPi stopped updating the inputs as of about 5 hrs ago.

I am absolutely certain it was because /var/log/ was full (df -h showed it as 100% use). The last recorded values in the log file stopped at about the right time.

I tried deleting a rotated syslog but that did not kick things into life, all the services seemed fine so I had to reboot and things were of course fine.

Been away (4hrs) and on coming back, a df -h shows

Filesystem      Size  Used Avail Use% Mounted on
/dev/root       3.9G  1.6G  2.2G  42% /
devtmpfs        484M     0  484M   0% /dev
tmpfs           489M     0  489M   0% /dev/shm
tmpfs           489M   50M  440M  11% /run
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           489M     0  489M   0% /sys/fs/cgroup
tmpfs            30M  4.0K   30M   1% /tmp
tmpfs            50M   33M   18M  66% /var/log
tmpfs           1.0M     0  1.0M   0% /var/tmp
/dev/mmcblk0p1   43M   22M   21M  52% /boot
/dev/mmcblk0p3   11G  116M  9.7G   2% /home/pi/data
tmpfs            98M     0   98M   0% /run/user/1000

It looks like the emonhub logging is being duplicated in syslog.

$ ls -la /var/log/
total 28952
drwxr-xr-x 11 root      root           520 Mar 17 09:56 .
drwxr-xr-x 12 root      root          4096 Aug  2  2018 ..
drwxr-xr-x  2 root      adm            100 Mar 17 09:56 apache2
-rw-r-----  1 root      adm          15656 Mar 17 14:50 auth.log
-rw-r--r--  1 root      root          7205 Mar 17 09:56 boot.log
-rw-------  1 root      utmp           768 Mar 17 09:57 btmp
-rw-r-----  1 root      adm       14720248 Mar 17 14:51 daemon.log
-rw-r-----  1 root      adm           1177 Mar 17 09:56 debug
-rw-rw-rw-  1 root      root           328 Mar 17 09:58 emoncms.log
drwxr-xr-x  2 pi        root            60 Mar 17 09:56 emonpilcd
-rw-r-----  1 root      adm          52393 Mar 17 14:51 kern.log
drwxr-xr-x  2 pi        pi              40 Mar 17 09:56 logrotate
-rw-r-----  1 root      adm          40227 Mar 17 14:51 messages
drwxr-xr-x  2 mosquitto mosquitto       60 Mar 17 09:56 mosquitto
-rw-rw-rw-  1 root      root             0 Mar 17 09:56 mqtt_input.log
drwxr-xr-x  2 mysql     adm             60 Mar 17 09:56 mysql
-rw-rw-rw-  1 root      root             0 Mar 17 09:56 mysql.log
-rw-rw-rw-  1 root      root             0 Mar 17 09:56 ntp_update.log
drwxr-xr-x  2 root      root            40 Mar 17 09:56 openhab
drwxr-xr-x  2 root      root            40 Mar 17 09:56 openhab2
drwxr-xr-x  2 redis     redis           60 Mar 17 09:56 redis
-rw-rw-rw-  1 root      root             0 Mar 17 09:56 service-runner.log
drwxr-xr-x  2 root      root            60 Mar 17 09:56 supervisor
-rw-r-----  1 root      adm       14781919 Mar 17 14:51 syslog
-rw-r-----  1 root      adm            396 Mar 17 09:56 user.log
-rw-rw-r--  1 root      utmp          1920 Mar 17 09:57 wtmp

I wondered if it was because I had the logging window open, but it does not appear to be the issue.

We need to investigate log files…

This is an issue I highlighted years ago but it was deemed “unlikely to happen so not worth fixing”.

When you say your input have stopped updating, is that all inputs? Or just those that come from a certian source or via a certain route? For the moment I sorta guess you mean emonhub has stopped passing data to emoncms.

I have seen this problem more than once, but I can confidently say 2 things that may be of interest to you.

  1. When this usually happens (previously), emonhub does not stop processing data. I have found systems that have not been updating logfiles for several months but there has been no interruption in data and as soon as I delete some logs, emonhub has started updating emonhub.log again immediately.

  2. On the one early install I had that used cron jobs to trigger and event twice a day via emonhub, I could occasionally see that the cron triggered inputs alone, had stopped updating and on every single occasion, when I looked into this I found that /var/log was full, no logs were being updated, emonhub was still running perfectly (although unable to write to emonhub.log and found that the cron jobs were not running successfully because syslog could not be written too.

The combination of these 2 facts plus your discovery that the emonhub logs are being duplicated to syslog (I assume the other place as being journalctl) I cannot help but wonder if this has something to do with emonhub writing it’s logs to syslog that has stopped it working as the characteristics of not being able to write to syslog and not being able to write to emonhub.log are apparently different in my experience.

Is this line active in your emonhub.conf?

If so try removing that. If not, I then find myself wondering if this is reklated to the recent changes to emonhub logging, namely the move for init.d service to a systemd unit and the subsequent move to dropping emonhub.log. Just a thought, nothing more than a suggestion for you to try.

Yes. All inputs stopped including a TX and TH.

No there is a big gap in the Feed data

No it is still commented out.

This is basically a stock EmonPi - fully up to date.

That is my thought.

Rebooted at 15:54 and /var/log is at 30% used.

I understood that, but it is (was?) not normal for emonhub to do that, i know without a shadow of a doubt, that emonhub normally ploughs on regardless of a full /var/log implying that something has altered to change that behaviour. What has changed? Oh yeah, emonhub is no longer init.d or logging to emonhub.log! I wonder if that’s pure coincidence or not?

I doubt that it is coincidence! Not many will have done an EmonPi update since that change has gone in.

That’s what I was inferring. When something breaks or changes characteristics all of a sudden, the first place to look is at what has changed that might be related, not many issues occur naturally.

If it does turn out to be linked, this is a prime example of why we shouldn’t change things that work perfectly, just for the hell of it. emonhub.logs have been by far the most reliable source of info for the last few years. I really didn’t want it to be changed.

If it happens again, before doing anything, copy the whole /var/log to disk so that you can trawl through the various files as we start looking in to it. There are (at least) 3 issues to look at here

  1. why emonhub stopped. Did it actually stop? or was it actually mosquitto that stopped passing data or emoncms feedwriter that stopped writing? etc etc etc

  2. Why did the log fill up? Prior to emonhub.log getting axed, the log entries were going there (emonhub.log) which is still part in /var/log so the overall size of /var/log has not actually changed, the log entries are just going to a different file within that folder. So what was over logging to fill it up?

  3. Why is emonhub writing to syslog? IIUC the changes made to the emonhub service unit were to make emonhub write it’s logs to journalctl not syslog so what’s going on there?

It didn’t, the rolling log was still, well rolling, just not writing any of the log to disk and no inputs.

Both were still apparently running but still no data on the input screen. I did see frequent mosquitto disconnections though.

I’ve noted this thread for @glyn.hudson and I too look at in more detail over the coming week.

Im also getting the issue with the log filling up but it does not affect the running of the system otherwise.

So what might stop all inputs EmonPi local, EmonTX and EmonTH (no external MQTT).

Im not sure, I guess we are going to be stuck to debug given the log files where full! It would be good to know if it was emonhub or emoncms_mqtt, or something with mosquitto. I haven’t seen this with the latest image thought so Im not sure.

emonhub, mosquitto or emoncms_mqtt services all need to be fully functional for the data to get from the emonpi/rfm2pi board to emoncms local.

What happens when emoncms cannot write to a log file? Are you 2 running the same inputs page? ie is beta interface on or off?

It continues to run as far as Im aware. In fact my /var/log is full now and its all running ok.

On.

The fact there is a data gap means emoncms really was not processing the input from emonhub, even though it appeared to be running (rolling log on emonhub admin page).

Yes I can confirm the journalctl log continues to show the recent log even with a full /var/log

sudo journalctl -f -u emonhub -n 1000

works fine, suggesting that its in program ram? and not reading from the tmpfs ram

Looking at your ls -la /var/log output in the first post, where are your rotated logs? Is this a brand new image or is logrotate not working?

The reason Glyn set the rotation at hourly was to keep the size down, but it is still easy to fill the log partition in side an hour. You should check the apache folder for oversized files, apache is well known for runaway logs when there’s an issue.

And… where is ufw.log? and ssh.log? UFW can fill up quickly when there is a brute force attack in progress. This is why I moved my ssh ports to a non-standard port number. I had never had a breach, but it was annoying me that brute force attacks would fill up the /var/log with errors, since using a non-standard port and closing 22 in ufw this doesn’t happen.

And why is the mysql.log file not in the mysql folder?

The issue is always daemon.log and syslog. @glyn.hudson Im sure you mentioned what the reason for this might be?

As far as I can see, there is no rotation - all the rotation conf files look stock to me. If you can point me to the conf file that does hourly, happy to check. In addition, I’ve been trying to work out how the syslogs are rotated but so far have come up blank (as there was one when it first went down).

ufw is inactive (remember, stock EmonPi deliberately).

[email protected]:~ $ sudo journalctl -u ssh
-- Logs begin at Sun 2019-03-17 17:25:15 UTC, end at Sun 2019-03-17 20:04:02 UTC. --
Mar 17 17:30:45 emonpi sshd[3141]: Accepted password for pi from 192.168.7.123 port 55065 ssh2
Mar 17 17:30:45 emonpi sshd[3141]: pam_unix(sshd:session): session opened for user pi by (uid=0)

Just to chime in I had a similar issue this afternoon.

Version 9.9.8 and fully up to date.

I started to install node-red. Wouldn’t create log. Checking the log partition was full.

Immediate reboot cleared log and I was able to install node-red. As far as I am aware it caused no other problem.

By default logrotate is run by a daily cron job, this is disabled and an hourly cron job added.

By logrotate usually.

Wow, why would it be inactive on an emonpi by default?

[email protected]:~ $ sudo crontab -l
*/5 * * * * /usr/local/bin/wifi-check > /home/pi/data/wificheck.log 2>&1
[email protected]:~ $ ls -la /etc/cron.hourly/
total 16
drwxr-xr-x  2 root root 4096 Oct 19 15:45 .
drwxr-xr-x 97 root root 4096 Mar 17 14:59 ..
-rwxr-xr-x  1 root root  191 Feb 22  2012 fake-hwclock
-rw-r--r--  1 root root  102 Oct  7  2017 .placeholder

Is all I can see.