Mosquitto won't start on boot after raspbian and emonsd update

Hello,

I had a look to overlayFS but it seems that you cannot “commit” the changes without unmounting the upper fs so it makes it tricky. And I definitely found ramlog too complex.

I finally pushed a new version of the logfile init/backup/restore here on github. It is working much as log2ram

So here are the results

On startup : it restores the backup if any and restore the default logfile “template” if needed.
Every hour and on shutdown, it makes a backup.

I did a test and it seems to work quite nicely. Looking at /var/log/messages :

Mar 14 20:35:32 emonpi rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="534" x-info="http://www.rsyslog.com"] start
Mar 14 20:35:32 emonpi kernel: [    0.000000] Booting Linux on physical CPU 0x0
...
Mar 14 20:40:06 emonpi rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="534" x-info="http://www.rsyslog.com"] exiting on signal 15.
Mar 14 20:40:16 emonpi rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="544" x-info="http://www.rsyslog.com"] start

So it seems :

  • last backup on shutdown occurs after rsyslogd is stopped which may be late enough
  • restore on boot occurs before rsyslogd starts which may be soon enough

I will push that to my live emonpi a see how it goes.
Any comment, correction, improvement is welcome.

If you think it is usefull, I will do a PR !

1 Like

How is this looking on your emonPi @jsroques?

Apologies (again) for the lack of engagement. I haven’t had the time to look at your work in depth and felt it demanded a proper evaluation rather than a “quick look” especially as I do not use the emonSD or anything else from the emonpi repo, so I will have to pick out the bits and build an image to fit them to to try this out properly.

I have come back here today just to add a couple of notes that maybe of interest following on from the “Data not logged locally” discussion.

Firstly it might be worth while having the log-rotation triggered by something other than cron. If the log partition fills up, cron can fail, so the bloated logs will not get rotated out. If (for example) a service (perhaps the same one as the log backups so they are sync’d?) was to time one hour and trigger, in a loop rather than using cron, even if the log partition fills up to the point cron doesn’t work, the files will still get rotated on the hourly timer service. This would have a minor benefit of removing some of the lorotate mods for RO OS as the original daily cron can remain, we do not need to add an hourly one.

Secondly, since this log file management is pretty much all related and all required if and only if the OS is RO, perhaps it could check the log partition usage and trigger a rotation at 90 or 95%, yes theoretically we lose 5-10% of the space but I’m thinking headroom for runaway errors, if it’s only checked every few secs it could fill up before the next check, this then presents the question over whether we need to rotate hourly if we are monitoring the partition size? perhaps just run rotate at 80% full and forget hourly rotations, just back up hourly instead?

These are not finished thoughts or feature requests, just some observations and related thoughts for possible consideration.

Hello @pb66,

I had some trouble with my emonpi. I don’t know why but my data partition is rather small (<200MB). Not such a big deal though but I wanted to solve this before anything else, and did not have the time to do it… So the log stuff is not yet running on my “production” emonpi. I guess I will have to make a backup, flash a new SD, and restore… But May is a good month for bank holidays in France :smile: so I will get some time to work again on this.

Is that a know problem problem with cron ? Or did you experience this ?

But then in case of a crash you would lose all logs. The one hour rotation / backup allows limited loss in case of OS crash.

That being said… In the first place, this was about allowing (dumb) services which are not able the create their own log file to start by creating the files in the ram drive at startup. Aren’t we trying to build something overly complex to backup a few logs ? I mean, if emonpi is to be used for some kind of “mission critical” stuff, then… may be a USB thumb drive, or better, a USB HDD or SSD should be used for all RW partitions.

What do you think ?

Sounds like the image wasn’t expanded, the data partition is less than 200MB to make the compressed size smaller and to make it possible to run on a 4GB sdcard. When you first boot the new image you should run emonSDexpand. Assuming you are using a sdcard larger than 4GB the data partition should grow significantly.

I used to experience it a lot, I have measures in place to avoid it, but to my knowledge there has been no specific fix. and to be honest I have not noticed if the rare occasions I’ve still seen it are on a recent OS or not, so I cannot confirm this is still an issue, only that it was an issue and that I’m not aware of any changes to that.

But the post I previously linked, does seem to confirm it’s still an issue.

Do the rotation and backup need to occur together? I wasn’t suggesting it shouldn’t be backed up hourly, just pondering on whether it needs to be rotated hourly. But it was just that, pondering, wondering what the optimum setup was.

For me the most important part is the backup and retention of the logs. Creating the empty log files and folders is a trivial task that just shouldn’t be done in rc.local, and preferably not use a harcoded list of log files.

The most valuable thing here IMO is the fact that users do not need to check their emonPi is still running every hour just in case any useful info is rotated out of the logs if they’re not quick enough and that they do not need to wait for a forum discussion or somehow copy all thier logs out of the emonPi so that they can try a reboot without fear of losing info to help debug.

Ok… I’m dumb… :roll_eyes: I probably thought this was done because latest Raspbian is doing this on first startup. So I took it for granted… Anyway… it is off topic.Thanks for the tip. Ran it last night, now got almost 4GB

Ah… sorry. Misinterpreted what you wrote. Missed the “just back up hourly instead”…
They are independent anyway (2 distincts cron jobs).

That would not prevent file system from filling up anyway as archiving is done on the very same file system (and archives are not compressed as of today). This would work if we could rotate logs on another fs which is not possible with the current version but may be possible with latest version as @borpin suggested. Next question is : what would be emonpi policy regarding OS / tools upgrade ?

And rethinking about this… if fs is full… :face_with_raised_eyebrow: logrotate will fail anyway as there is no more space

Yep… you’re right. I did have hard time with that while starting this log file stuff as I lost log files on each reboot and make troubleshooting… a bit tricky.

If logrotate allows crossing fs boundary, we could

  • run logrotate every hour, day, on fs space trigger… (which ever seems appropriate) on a persistent fs (data typically). That would
    • prevent space problems on /var/log (but may cause some on target fs. we would limit the risk of bloating persistent fs by limiting number of archive or could we use quotas on a directory ? We definitely don’t want to fill another fs up with logs…)
    • serve as a persistent log history
  • run a backup of current logs (without the history) every minute, hour… so in case of crash, we loose minimum logs
  • run a backup on shutdown so that we keep the most recent history in case of a reboot
  • restore the backup on restart so that we restart logging where we left it

Does that make sense ?

My thinking was to take the trigger out of the hands of cron so that it cannot be blocked by a full log partition. If the new script had an inbuilt timer/sleep and did it’s checks and found the log partition full, it could still rotate out the files when cron might not run. And if it was on it’s own “shorter” timer loop, it could discover the 95% limit much sooner than an hourly rotation cron.

I really must do some proper tests to see if this is still an issue in both Stretch and whatever the emonPi is running after a (full?) update. Although I’m tempted to just develop for the latest OS and assume the emonSD/emonPi will get revised at some point, it seems foolish to develop for an out of date OS.

Already covered in the answer above?

Just to add a note here about the emonSD log files in general, IMO they need tidying up somewhat. Whilst looking into something else I fired up a emonpi (older image) and this was the log folder

pi@emonpi(rw):~$ ls -la /var/log
total 1920
drwxr-xr-x 11 root        root            640 Apr 30 11:17 .
drwxr-xr-x 12 root        root           4096 Nov  7  2016 ..
drwxr-xr-x  2 root        adm              60 Mar  8 17:17 apache2
-rw-r-----  1 root        adm           36044 Apr 30 11:26 auth.log
-rw-r--r--  1 root        root           8997 Apr 30 10:14 boot.log
-rw-------  1 root        utmp              0 Mar  8 17:17 btmp
-rw-r-----  1 root        adm           56899 Apr 30 11:17 daemon.log
-rw-r--r--  1 root        root           1344 Apr 30 10:14 dataplicity.log
-rw-r-----  1 root        adm            1093 Mar  8 17:17 debug
-rw-rw-rw-  1 root        root         318915 Apr 30 11:26 emoncms.log
-rw-rw-rw-  1 root        root        1325267 Apr 30 11:17 emoncms.log.1
drwxr-xr-x  2 emonhub     root             80 Apr 30 11:17 emonhub
drwxr-xr-x  2 pi          root             60 Mar  8 17:17 emonpilcd
-rw-r-----  1 root        adm           32712 Apr 30 11:25 kern.log
drwxr-xr-x  2 pi          pi               80 Apr 30 11:17 logrotate
-rw-r-----  1 root        adm             251 Apr 30 11:17 mail.err
-rw-r-----  1 root        adm             545 Apr 30 11:17 mail.info
-rw-r-----  1 root        adm             545 Apr 30 11:17 mail.log
-rw-r-----  1 root        adm             251 Apr 30 11:17 mail.warn
-rw-r-----  1 root        adm           31619 Apr 30 11:25 messages
drwxr-xr-x  2 mosquitto   mosquitto        60 Apr 30 10:14 mosquitto
-rw-rw-rw-  1 root        root              0 Mar  8 17:17 mqtt_input.log
drwxr-xr-x  2 mysql       adm              60 Mar  8 17:17 mysql
-rw-rw-rw-  1 root        root              0 Mar  8 17:17 mysql.log
-rw-rw-rw-  1 root        root              0 Mar  8 17:17 ntp_update.log
drwxr-xr-x  2 openhab     openhab          80 Apr 30 10:14 openhab
drwxr-xr-x  2 redis       redis            60 Mar  8 17:17 redis
-rw-rw-rw-  1 root        root           1898 Apr 30 11:26 service-runner.log
drwxr-xr-x  2 dataplicity dataplicity      60 Mar  8 17:17 supervisor
-rw-r-----  1 root        adm          100992 Apr 30 11:26 syslog
-rw-r-----  1 root        adm            5373 Apr 30 11:25 ufw.log
-rw-rw-r--  1 root        utmp           2304 Apr 30 10:17 wtmp

IMO there should be a wider use of sub folders eg

emoncms.log
emoncms.log.1
mqtt_input.log (not sure if this is still used)

should be in a emoncms folder (there is an unfinished discussion about emoncms file structure at Move data & settings file to a common location?)

and

service-runner.log
emonpilcd/emonpilcd.log
emonpiupdate.log
wificheck.log

should probably all be in a single “emonsd” subfolder, since the logs will be persisted across reboots, there would no longer be a need to put any logs on the writable “data” partition (unless they are backups) but this isn’t directly the responsibility of the log rotate/persist script, although I guess it is related due to the “template”.