My emonBase stopped recording data from my emonTxen, presumably because /var/log filled up. They’re currently showing no data for 23 hours.
I’ve deleted some data from the beginning of emoncms.log, which was huge, and deleted some old apache2 and syslog files, so there is now space in the directory but the emonTx inputs are still stale. Feedwriter had exited, so I’ve restarted that and it says it’s caught up.
I’m pretty much at a loss where to look next. Is there documentation of how the input data is processed, or a troubleshooting guide? Alternatively, can anybody suggest what I need to look at next?
edit: a bit more context. The feeds that are imported from elsewhere by scripts (weather and solar readings) are still updating normally, hence why I think it’s something emoncms/emonhub specific.
The emonhub service status was a log that was saying No space left on device up until the time I made some space and nothing after that. So I restarted the service and now the log is up-to-date with startup information:
May 14 11:32:49 emonpi sudo[32128]: pam_unix(sudo:session): session opened for user root by (uid=0)
May 14 11:32:49 emonpi sudo[32128]: pam_unix(sudo:session): session closed for user root
May 14 11:32:49 emonpi sudo[32138]: root : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/chown -R emonhub /var/log/emonhub
May 14 11:32:49 emonpi sudo[32138]: pam_unix(sudo:session): session opened for user root by (uid=0)
May 14 11:32:49 emonpi sudo[32138]: pam_unix(sudo:session): session closed for user root
May 14 11:32:49 emonpi sudo[32147]: root : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/chown -R emonhub /var/log/emonhub
May 14 11:32:49 emonpi sudo[32147]: pam_unix(sudo:session): session opened for user root by (uid=0)
May 14 11:32:49 emonpi sudo[32147]: pam_unix(sudo:session): session closed for user root
May 14 11:32:49 emonpi emonhub[32118]: Starting OpenEnergyMonitor emonHub: emonhub has been started ok.
May 14 11:32:49 emonpi systemd[1]: Started LSB: Start/stop emonHub.
but my emonTxen are still not updating.
edit: Oh and all the services on the admin page are shown as running and green except for emonPiLCD but then I don’t have an emonPi.
Are your emonTx’s posting via RFM (rfm2pi receiver) or via WiFi (huzzah modules)?
If it’s the former, check the emonhub logs, they maybe at /var/log/emonhub/emonhub.log if you are still running the init.d script version. Or maybe in journald or systemd or daemon.log if using systemd
service depending on where your setup is at currently update wise!
Whether you can see this log info in emoncms depends on what emoncms version you have and whether it ties in with the emonhub service type (I know it’s a mess!), but to be fair, I think if you are updating via the emonpi updater, the changes should happen simultaneously, but IIRC you aren’t using a Raspbian based OS (correct? or am i getting crossed wires?) so I’m not sure how that translates to your set up.
[edit] also worth checking your inputs in emoncms inputs page are not duplicated, if they are see duplicated inputs · Issue #1198 · emoncms/emoncms · GitHub and the linked threads for info on deleting the duplicates to restore data logging. .
emonhub.log is just full of MQTT publishing messages (incidentally, can I just disable MQTT since I don’t use it?)
systemctl status emonhub just shows the same as I posted above. Nothing new since.
My logging system is a standard emonBase. I was trying to get emoncms running on my openSUSE desktop so I could do some development, but I gave that up when I discovered how unportable it is.
My inputs are fine (there was a mysterious Node 3 that appeared out of nowhere, but I deleted that). The red light on the emonTx is flashing regularly.
So you do have a /var/log/emonhub/emonhub.log (ie it’s not via journald)?
If your emonhub.log is showing MQTT publishing, it must also show the packets arriving via serial from the rfm2pi as emonhub can only publish what it receives, so I assume all is ok in emonhub, ie the rfm packets must be arriving if the’re being publised, but you would need to confirm it’s the right traffic to be sure. Do you recognise the emonTx data passing through emonhub?
Are you saying “my inputs are updating just fine” or “my inputs are fine as there are no duplicates” ?
If the inputs are updating then the issue is not emonhub or MQTT/Mosquitto, it’s within emoncms. That can only be the feedwriter IF the new data is timestamped later than the last recorded update (ref your other thread). What does the feedwriter line on the admin page say? usually 60s interval and ??datapoints buffered to write, or words to that effect.
MQTT is how emonhub passes data to emoncms on the emonSD so unless you have set up your local emoncms as a target for an additional http interfacer in emonhub, you are dependent on MQTT for local data.
This used to get me confused as well. I think this is right for a default install. Personally I still use the HTTP interface locally.
If the Inputs page shows that the data is fresh, but the data on the Feeds page is not being refreshed, then it might be feedwriter (assuming you use it). Even though feedwriter shows it is running I have found before that it isn’t actually doing anything.
sudo systemctl restart feedwriter.service
If the data is still in Redis (so you have not rebooted) it should actually catch up.
As would I if I used the emonSD. I understand publishing MQTT data for the benefit of other services is handy, but just because that data is there for that purpose, it is not (IMO) essential that is the route used to emoncms. It adds another step (MQTT broker) between emonhub and emoncms which adds both complexity and as we have seen so many times, potential for failures (mqtt broker or emoncms mqtt inputs). The appeal of the MQTT was AFAIR the inputs labels, but that came at the cost of timestamps being set closer to source. Now with the device module being promoted to auto-create processing and feeds, the input naming is less of a attraction. The timestamps and ability to buffer data (eg to pause for a backup creation etc) wins hands down for me, let alone the reliability and accuracy benefits.
No, the file contains MQTT entries and only MQTT entries. Like these:
2019-05-15 09:24:52,637 INFO MQTT Publishing: emonhub/rx/8/values 2854,3464,-2823,0,252.22,300,300,300,300,300,300,874001,-32
2019-05-15 09:24:56,582 INFO MQTT Publishing: emonhub/rx/7/values 595,0,0,0,248.35,15.7,300,300,300,300,300,621608,-60
My inputs are NOT updating. There are the correct number of them, except there was the addition of ‘Node 3’. The inputs from the emonTxen currently show an Updated value of 45 hrs. My imported data OTOH is updating regularly as normal.
feedwriter **Active** Running - sleep 60s 0 feed points pending write
All the other srevices say ‘Active Running’
No, sorry, I seem to have confused both you and Paul. My emonTx inputs are NOT updating. feedwriter had stopped when the logs were full so I restarted it after clearing some space. It claims to be fine now.
I’m a bit concerned that there doesn’t seem to be anywhere to look for information to diagnose this problem. Anything to look for in some log or other? Or anything on the admin page?
I’m getting to the point where I’m concerned about the loss of logged data, so I’m beginning to think about rebooting the system to see if that fixes the problem. If I do that we will lose all evidence and be no nearer finding the cause of the problem. Does @TrystanLea or anybody have any other ideas to try first to diagnose this?
So what loglevel are you using? it should be DEBUG for debugging or WARNING for a less verbose output, INFO never really got off the ground. IMO those example lines you gave us should be DEBUG not INFO as they are only good for in-depth debugging, connecting to the broker or subscribing to a topic is INFO, but printing every value of every publish, is DEBUG IMO. But INFO just hasn’t really been utilised much, in practice there are essentially 2 levels, DEBUG = verbose and WARNING = quiet.
Ok, therefore as we know emonhub is publishing, that would suggest the serial data is being passed via MQTT to the Mosquitto broker so that bits ok. (you can check that by setting the emonhub log level to DEBUG)
If you have other inputs functioning ok then feedwriter and core emoncms parts are apparently ok.
I would say the issue lays with the MQTT input to emoncms. It failed when the log was full as it couldn’t place it’s log messages, plus it’s possibly the only service you haven’t restarted. It was this I had in mind when I mentioned duplicated inputs, but the charateristics of the mqtt input crashing due to a full log, may differ to what we have previously seen.
there is information to be found, it’s just that the changes from init.d to systemd, mqtt_input to emoncms_mqtt, the move to emoncms.log for the mqtt, feedwriter and service-runner etc etc all happening at different times but in rapid succession really doesn’t help us know what you might have and where to find stuff. Plus, at the very core of your issue is the fact the logging stopped, that alone may explain the absence of some logs, eg the emoncms mqtt input.
There is no emoncms_input service.
mqtt_input is running. I just restarted it in case. No difference - no wait, I spoke too soon. Everything’s updating now.
If you have the emoncms/config module installed (by default on emonSD) you can check the “emonhub” page, it’s near the top of the emonhub.conf.
Alternatively (I think) it’s also found at /home/pi/data/emonhub.conf and/or /etc/emonhub/emonhub.conf, the latter is the generic location for it, the former is where it was put due to the RO OS (which the latest image isn’t). It should have been symlinked so that the generic path was still valid, but I don’t think that was necessarily the case.
That is pretty much the case for all the OEM services (except pre-systemd emonhub)
Yeah! That’s 'bout the size of it, the issue is there is no room for new log messages, so it cannot leave that error message (even if it existed in code). The issue is that the services are not robust enough to manage that situation. emonhub (with it’s own log) is the exception to that rule, I have tested this thoroughly and emonhub doesn’t fail due to a full /var/log (or at least if it has it’s own direct log it doesn’t).
The content of your /var/log folder raises another question over logrotation. If the emonSD logrotation was in play, those .2.gz, .3.gz and .4.gz files wouldn’t be there. So I assume you’re on a daily “standard” logrotation. I do not recommend updating at the moment as you will not only pull in the emonhub service changes, but you will most likely lose the logrotation altogether as the “hourly fix” doesn’t seem to work at all! can you please post the output of the following
ls -la /etc/logrotate.conf
cat /etc/logrotate.conf
cat /etc/cron.{daily,hourly}/logrotate
and
ls -la /var/log/logrotate
ls -la /var/lib/logrotate/status
cat /var/lib/logrotate/status
Thanks. I’d forgotten that page was there. So the answer was INFO. I think I might hack those warnings about old data to be INFO and then turn the logging level down.
Yes, the latter. It continues running after the log space becomes available again so it could log a warning at that point. But then if it recognized that situation, it should also be smart enough to fix whatever actually broke and stopped it working!
It occurred to me that it might not be mqtt_input that really has a problem. It might be mosquitto that is broken, plus the combination being dumb enough to not spot that the input was disconnected when I restarted mosquitto and not reconnect. Either way, it’s pretty obvious there’s some poor logic in there somewhere.
I don’t think I’ve adjusted log rotation at all.
$ ls -la /etc/logrotate.conf
-rw-r--r-- 1 root root 599 May 5 2015 /etc/logrotate.conf
$ cat /etc/logrotate.conf
# see "man logrotate" for details
# rotate log files weekly
weekly
# keep 4 weeks worth of backlogs
rotate 4
# create new (empty) log files after rotating old ones
create
# uncomment this if you want your log files compressed
#compress
# packages drop log rotation information into this directory
include /etc/logrotate.d
# no packages own wtmp, or btmp -- we'll rotate them here
/var/log/wtmp {
missingok
monthly
create 0664 root utmp
rotate 1
}
/var/log/btmp {
missingok
monthly
create 0660 root utmp
rotate 1
}
# system-specific logs may be configured here
$ cat /etc/cron.{daily,hourly}/logrotate
#!/bin/sh
test -x /usr/sbin/logrotate || exit 0
/usr/sbin/logrotate /etc/logrotate.conf
cat: /etc/cron.hourly/logrotate: No such file or directory