How to recover emontx inputs after log full

(Dave Howorth) #19

Thanks. I’d forgotten that page was there. So the answer was INFO. I think I might hack those warnings about old data to be INFO and then turn the logging level down.

Yes, the latter. It continues running after the log space becomes available again so it could log a warning at that point. But then if it recognized that situation, it should also be smart enough to fix whatever actually broke and stopped it working!

It occurred to me that it might not be mqtt_input that really has a problem. It might be mosquitto that is broken, plus the combination being dumb enough to not spot that the input was disconnected when I restarted mosquitto and not reconnect. Either way, it’s pretty obvious there’s some poor logic in there somewhere.

I don’t think I’ve adjusted log rotation at all.

$ ls -la /etc/logrotate.conf 
-rw-r--r-- 1 root root 599 May  5  2015 /etc/logrotate.conf

$ cat /etc/logrotate.conf
# see "man logrotate" for details
# rotate log files weekly

# keep 4 weeks worth of backlogs
rotate 4

# create new (empty) log files after rotating old ones

# uncomment this if you want your log files compressed

# packages drop log rotation information into this directory
include /etc/logrotate.d

# no packages own wtmp, or btmp -- we'll rotate them here
/var/log/wtmp {
    create 0664 root utmp
    rotate 1

/var/log/btmp {
    create 0660 root utmp
    rotate 1

# system-specific logs may be configured here

$ cat /etc/cron.{daily,hourly}/logrotate

test -x /usr/sbin/logrotate || exit 0
/usr/sbin/logrotate /etc/logrotate.conf
cat: /etc/cron.hourly/logrotate: No such file or directory

There is no /var/log/logrotate

-rw-r--r-- 1 root root 1279 May 15 06:25 status

$ cat /var/lib/logrotate/status 
logrotate state -- version 2
"/var/log/syslog" 2019-5-15-6:25:1
"/var/log/dpkg.log" 2018-10-5-6:25:1
"/var/log/mosquitto/mosquitto.log" 2018-10-30-6:25:1
"/var/log/auth.log" 2019-5-12-6:25:2
"/var/log/apt/term.log" 2018-10-5-6:25:1
"/var/log/mysql/mysql-slow.log" 2018-8-7-6:0:0
"/var/log/apt/history.log" 2018-10-5-6:25:1
"/var/log/mysql/error.log" 2019-5-15-6:25:1
"/var/log/alternatives.log" 2018-8-7-6:0:0
"/var/log/debug" 2019-3-17-6:25:1
"/var/log/mail.log" 2018-8-7-6:0:0
"/var/log/kern.log" 2019-3-24-6:25:2
"/var/log/aptitude" 2018-8-7-6:0:0
"/var/log/redis/redis-server.log" 2019-3-24-6:25:2
"/var/log/apache2/access.log" 2019-5-15-6:25:1
"/var/log/ufw.log" 2018-9-19-6:0:0
"/var/log/wtmp" 2019-5-1-6:25:2
"/var/log/daemon.log" 2019-5-12-6:25:2
"/var/log/mail.warn" 2018-8-7-6:0:0
"/var/log/btmp" 2019-5-1-6:25:2
"/var/log/lpr.log" 2018-8-7-6:0:0
"/var/log/mail.err" 2018-8-7-6:0:0
"/var/log/mysql/mysql.log" 2018-8-7-6:0:0
"/var/log/lighttpd/error.log" 2018-8-12-6:25:2
"/var/log/user.log" 2019-3-17-6:25:1
"/var/log/" 2018-8-7-6:0:0
"/var/log/apache2/other_vhosts_access.log" 2018-8-7-6:0:0
"/var/log/apache2/error.log" 2019-1-19-6:25:1
"/var/log/cron.log" 2018-8-7-6:0:0
"/var/log/messages" 2019-5-12-6:25:2
"/var/log/mysql/mariadb-slow.log" 2018-8-7-6:0:0

No, I’m not planning on updating, thanks :wink:

(Dave Howorth) #20

It’s not as simple as that. I just restarted mosquitto again and the input automatically reconnected.

(Brian Orpin) #21

Sorry to say I think that I worked through this with someone else - should have remembered earlier :frowning:

yep Reboot Required Every 20 Days Or So To Restore Logging.

Don’t know why I didn’t remember…

(Paul) #22

I think you’ve got some crossed wires there. The config on the emonhub page is purely for emonhub. Those “old data” messages you refer to are from emoncms not emonhub and that loglevel is set in emoncms/settings.php and not accesible via the emoncms gui.

I’m pretty confident it is, we’ve had issues before, we also saw that the broker was ok as emonhub was able to publish to it, so the issue lay inbetween the working broker and the emoncms inputs, of which, the http ones were updating ok.

That’s good to hear, thanks for sharing that. I’m trying to get a handle on what it and isn’t going on with the emonSD without a valid build guide and it’s a very fluid moving target as the updates and updater are frequently changing.

From your outputs

You are using the original package installed logrotate.conf (dated 2015), this and some other observations confirms what I suspected, that the whole logrotation modification was missed on the oct 2018 image.

So this means you are getting daily checks and weekly rotations not based on size unless overriidden by a package specific conf in /etc/logrotate.d/

although there may not be a /var/log/logrotate/logrotate.status file, the logrotate folder is there in your /var/log printout above, it’s ownership is pi:pi, that tells me it was (most probably) created by /etc/rc.local.

I’ve just been looking at what the logrotation needs for a interim fix and this info is useful, thanks.

Hopefully that won’t always be the case, but either way, I can let you know what I find and recommend with logrotate so you can decide whether to manual adjust or update.

(Brian Orpin) #23

No it looks pretty standard e.g. auth.log rotating weekly.

Pretty sure this is an ‘install’ rather than an SDImage IIRC. In which case the update scripts are not used.

What is the contents of /etc/logrotate.d/. Probably a load of specific configs for various things.

I think if you move the logrotate cron trigger to hourly from daily, and then create the 2 files I list in my issue you should get a stable system (check the directories in the post rotate command).


(Dave Howorth) #24

There are certainly similarities (i.e. logs filling up and stopping the system) but it doesn’t seem to be the same situation as me (different logs, different services etc)

(Dave Howorth) #25

You’re right that I hadn’t realized/remembered they were two different logs. It seems par for the course that the emoncms webUI would provide screen editing for an alien log (emonhub) but not for its own log :roll_eyes:

emoncms.log is set to WARN level, so I’ve hacked PHPTimeSeries.php to make those messages INFO instead of WARN. I suppose I need to restart that somehow for it to take effect? Also, emoncms.log doesn’t seem to get rotated. Where is that done? (possibly a multi-valued answer :grin: )

You’re quite right. Apologies for my mis-statement.

(Dave Howorth) #26

Ifyou’re talking about my system, it is a shop-bought pre-installed SD card.

$ ls /etc/logrotate.d/
apache2  apt  aptitude  dpkg  lighttpd  mosquitto  mysql-server  redis-server  rsyslog  ufw
(Paul) #27

Probably not, emoncms is not a long running service, the script(s) are called for each request.

No apologies due, I wasn’t aware you had mis-stated anything! That was just one of the things I noted under “From your outputs”

(Dave Howorth) #28

Well the WARNings are still being logged, so something is not unloaded between requests or whatever.

(Paul) #29

Mmmmmh! That’s odd, but I guess nothing surprises me much these days.

Where did you do the editing, just to confirm we’re on the same page?

I wasn’t aware of any php caching (server side), I have not needed to restart following code edits before now (aside from browser refresh, which shouldn’t apply here). You could try a sudo systemctl reload apache2, that will do a “graceful restart” which IIRC isn’t actually a restart, it just reloads everything, but AFAIA apache2 doesn’t cache and php caching isn’t installed (but php is not my strong point), perhaps there’s just a shared copy of a included script lingering in ram?

(Dave Howorth) #30

This is the actual edit:

// 2019-05-16 djh  Changed 'warn' to 'info' to stop nuisance logging of DataPoint uploads
$this->log->info("post_bulk_prepare() data in past, nothing saved.  feedid=$feedid timestamp=$timestamp last=".$lastvalue_static_cache[$feedid]['time']." value=$value");

It does a graceful restart every morning as part of the logrotate. Graceful just means that apache allows its threads to run to completion (i.e. finish servicing a request) before terminating and restarting them. It rereads its configuration as well. So it’s a method of completely restarting but with no visible effects on clients.

May 18 06:25:03 emonpi systemd[1]: Reloading The Apache HTTP Server.
May 18 06:25:03 emonpi apachectl[20841]: AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.
May 18 06:25:03 emonpi systemd[1]: Reloaded The Apache HTTP Server.

Sadly the WARNings are still being logged. I’m out of my depth on PHP as well.

2019-05-18 11:01:35.817|WARN|PHPTimeSeries.php|post_bulk_prepare() data in past, nothing saved.  feedid=50 timestamp=1558166400 last=1558170000 value=1
2019-05-18 11:01:35.818|WARN|PHPTimeSeries.php|post_bulk_prepare() data in past, nothing saved.  feedid=50 timestamp=1558170000 last=1558170000 value=1
(Dave Howorth) #31

Solved it :slight_smile: Entirely by luck.

I reread this thread and had just been looking at the apache log and realized as I reread my explanation about graceful restarts that apache wouldn’t actually restart until all the threads finalised their requests. But emoncms keeps active requests with continual log entries showing it updating lists etc. So I shut down the browser window (on my openSUSE PC, note) that was displaying emoncms pages.

And lo and behold the WARN messages on the emonpi stopped!

So, obvious with hindsight, in order to unload PHP you need to stop accessing the web pages it is serving. Or I suppose a hard reset of apache would do the job.

(Dave Howorth) #32

I spoke too soon. They’re still coming. :frowning:

(Paul) #33

Oh dear! That’s annoying!

Have you tried restarting the feedwriter service?

(Dave Howorth) #34

Not yet. I restarted emonhub after posting my last message, but that doesn’t seem to have fixed it. I’ll try feedwriter now.

(Dave Howorth) #35

Restarting feedwriter seems to have done the trick. :fingers-crossed:

(Paul) #36

I would have been surprised if it had TBH, but the feedwriter calls the functions in PHPTimeseries.php indirectly (via redisbuffer.php IIRC) although I didn’t expect it to be able to hold the old scripts open. I see this as a bit of a issue (not that user will be editing code themselves very often). It sort of undermines the apache “graceful” reload if you must restart feedwriter too, I’m guessing even a full apache2 restart wouldn’t have caused the feedwriter to reload those files. Maybe the feedwriter and emoncms_mqtt services need a “reload” function that can be then called from a single “emoncms_reload” command? my distinction for the “reload” functions is that I’m guessing that each time the feedwriter or emoncms_mqtt services are currently restarted they might risk losing some data that’s in transit or buffered. It’s not a huge problem, but could get annoying if you lose a minute or two worth of data at each push of the update button.

(Dave Howorth) #37

I think apache and feedwriter are two separate issues. In order for apache to reload/restart gracefully, it needs for all page requests to terminate, as they used to do in the good old days. I have a habit of leaving emoncms pages open continuously, which means the daily apache restart has no effect, but I’m not sure what the purpose of that restart is? If I do want to restart apache, I have two choices. Either shut down my browser pages or force a restart of apache. Which is more context to remember but otherwise fairly harmless.

Restarting feedwriter is necessary to make it reread the PHP modules. It isn’t connected to apache restarts. I suppose the good news is that presumably there aren’t any serious memory leaks in feedwriter, since as a permanent daemon it would expose them pretty quickly. I don’t know enough about the system to judge whether restarts can cause data loss; I would hope there’s suitable buffering to prevent it. Perhaps @TrystanLea knows?

(Paul) #38

I think so too, there should be no reason to re(load|start) apache2 to get new code picked up as it doesn’t cache.

Looking at it the service just seems to start the php script and that loops until it is stopped, there doesn’t seem to be any “clean exit” functionality to ensure any buffered data is written before exiting, or in the case of the emoncms_mqtt script, fully processing received MQTT before exiting, it’s only a small thing and I’m not even 100% sure that’s the case, but it would be good to know for sure, even if I am wrong.