Startup & systemd - a personal view

Why do separate logs exist at all? All those processes are run as systemd services, I think, so their output is already logged by journald.

As you say, journald can be configured to write to RAM or to persist to disk/SD. And it will buffer things until the appropriate directories are available. So what are all the manually created log files and directories for?

edit: Just to add that the only need for persistent logs is if there’s some problem causing the whole machine to crash, yes? And in that case, it’s often better to have the log on another machine anyway (or on a serial console in the good old days ITGOD). So why not leave the journal in RAM and then switch the journald mode to persistent if such a problem arises. Or as an additional bonus, if there is a disk (or USB) connected with a partition labelled say JOURNAL-DATA (case-immaterial) or an NFS filesystem available/mounted with a similar name then use that as an auxiliary destination for the journal data?

I suspect because this has grown organically since before systemd was in use. Possibly you are correct and they could be handled a different way. However, I always think switching logging on (efectively) is a bit like shutting the stable door after the horse has bolted. I’d rather a well implemented log rotation but I have no idea how to do that with systemd.

Looks like you have found yourself your first task :slight_smile:

Could be. It seems more promising than other things I’ve looked at.

The journal is always running, so the logs are always available for all problems except those that cause a system crash, which one hopes is rare. I’d be happy to try implementing something that automatically used an external logging device in addition (disk/USB key/NFS) if one is present.

The first exception that springs to mind is emoncms, which is pretty much centre stage here.

I think it’s a bit soon to be looking at getting rid of /var/log altogether and it takes a reasonably sized task and makes it a mammoth task.

There are a lot of users that write custom scripts that run from cron triggers and manually started utilities etc. I know I have dozens. Yes it’s possible to convert them all I guess, but are we really going to take that option away from more novice users that don’t want to write a service? There is a lot of scripts and hacks out there to interconnect IoT stuff and perform various monitoring tasks etc, currently users can add stuff like that without having to totally overhaul it, we need to keep that flexibility IMO.

Well no, not entirely. Many issues with front end hardware and firmware has been resolved by going back through logs such as emonhub.log, sometimes when patterns appear a log history is extremely useful. Especially when debugging emoncms as you can spot an anomaly in a emoncms graph and then look at the raw data processed by emonhub. You get to see the data from the front end and the data that was sent to emoncms.

I think that is a good option, but the emonSD is aimed at being self contained, dependency on a 3rd party site for log info cannot be the only option. And if that doesn’t run smoothly, how would we debug that without some local logs?

I think what you suggest is something to be looked into and possibly implemented in time, but the immediate problem(s) of getting shot of rc.local and improving the logging in RAM (that isn’t going to be totally removed anytime soon) could be done swiftly by simply persisting the logs to disk regularly and reading the existing logs (and folders etc) into ram at boot without muddying the waters with other issues and ambitions. It’s a ready made solution that gives us a big leg up, a vast stride forward with very little effort. That’s not to say that development cannot continue on from there and look closer at more dependence on journalctl. I’m not convinced a shift to journalctl ticks all the issues boxes and offers something that will work for everyone in all circumstances.

I’m talking about the processes that are listed in rc.local. emoncms is not one of them.

I don’t think I suggested that.

All my cron jobs are silent when things are going well and send me email if things go wrong. They never write anything to a log. Are yours different?

You can manually start jobs under systemd too.

I didn’t suggest that either. Everybody is free to keep on doing whatever they want. The point is that the emon-world wouldn’t.

The logs would still be available through journalctl, so I’m not sure what the problem is?

I don’t know the system well enough to comment about this. Personally, I’ve been looking at the actual feed data files and the data reported by the graph module in its CSV output. But in any case to repeat, I’m not suggesting doing away with any log data at all, just changing how it is accessed.

I’m not suggesting any third-party involvement at all. If I have a USB disk or USB key or if I have another machine (like the one I’m typing on now) that can make an NFS export available, then I could export the logs if I chose to.

And again, I am not suggesting doing away with local logs, just not storing them twice, once in the journal and once in a log file.

I think what I’ve suggested does exactly what you ask. It keeps the logs in RAM and it optionally stores them to disk too. And it’s a lot more ready-made than what I think you’re suggesting.

I’ve noted this discussion for @glyn.hudson and I to review next week. I dont know the historical detail on this as it was @glyn.hudson’s work however I noted that in the recent thread Services not restarting on upgrade to 9.9.7 - #41 by Greebo I had to add the entries for restarting service-runner & emoncms_mqtt to the jessie rc.local to ensure those services started at boot while these entries are not present on the strech version of rc.local.

The stretch version of rc.local is much reduced in general, @glyn.hudson can Im sure comment.
https://github.com/openenergymonitor/emonpi/blob/master/rc.local

We are moving in that direction, emonhub.log is now accessed from journalctl

emoncms.log does not use systemd, as it is a mixture of logging coming from the emoncms application running on the apache webserver as well as potentially other scripts.

Im not sure about /var/log/openhab (the final log item listed in the stretch rc.local).

That file you linked is NOT the stretch version, it was last updated in 2015. The jessieminimal version is the current version (the one you edited).

Openhab is no longer installed to emonSD by default but for some backward compatibility it may still be needed. The greater issue is moving forward, implementing any per service edits tie us into a life of always tweaking any new softwares to fit what ever solution emonSD is using, the aim has to be to accommodate packages and softwares without individually modifying them so that users can add stuff without having to come to us to tweak it for them.

I had got the impression that you were by some of your comments, although, no you didn’t specifically say that I guess. Most of my comment were obviously with this assumption in mind, but none the less, I still feel that tackling the root cause of the problems ie the current method of logging in ram, is a better approach than just altering the services that have starting issues due to said root cause.

Other logs (such as emoncms) will still be an issue, not only with starting, but also with persisting for better debugging and not losing everything at a reboot.

I’m not saying the log files shouldn’t be moved to journalctl if thats what is wanted, but it doesn’t tackle the root problem system wide so an addditional fix to the ram logging is still needed and that fix alone, would fix the systemd starting issues (and lose rc.local) so moving the logs to journalctl is not a system wide fix and it becomes unnecessary when a wider fix is applied, that doesn’t mean it shouldn’t be done, just that’s it’s not a priority, just a possible “nice to have” .

But they are not held in 2 places, only the services that have been tweaked to send thier logs to journalctl do so. By default mysql, apache2 and redis etc only have normal logging, heres an example from a stock Debian stretch install. The logs only exist in /var/log, what you suggest is not removing duplicates but simply relocating them.

pb66@test2:~$ journalctl -u mysql
Hint: You are currently not seeing messages from other users and the system.
      Users in groups 'adm', 'systemd-journal' can see all messages.
      Pass -q to turn off this notice.
-- No entries --
pb66@test2:~$ journalctl -q -u redis
pb66@test2:~$ sudo ls -la /var/log/{mysql,redis}
/var/log/mysql:
total 36
drwxr-s---  2 mysql adm  4096 Mar 16 06:25 .
drwxr-xr-x 11 root  root 4096 Mar 16 06:25 ..
-rw-r-----  1 mysql adm     0 Mar 16 06:25 error.log
-rw-r-----  1 mysql adm    20 Mar 15 06:25 error.log.1.gz
-rw-r-----  1 mysql adm    20 Mar 14 06:25 error.log.2.gz
-rw-r-----  1 mysql adm    20 Mar 13 06:25 error.log.3.gz
-rw-r-----  1 mysql adm    20 Mar 12 06:25 error.log.4.gz
-rw-r-----  1 mysql adm    20 Mar 11 06:25 error.log.5.gz
-rw-r-----  1 mysql adm    20 Mar 10 06:25 error.log.6.gz
-rw-r-----  1 mysql adm    20 Mar  9 06:25 error.log.7.gz

/var/log/redis:
total 6100
drwxr-x---  2 redis redis    4096 Mar 10 06:25 .
drwxr-xr-x 11 root  root     4096 Mar 16 06:25 ..
-rw-rw----  1 redis redis 2817401 Mar 16 10:45 redis-server.log
-rw-rw----  1 redis redis  351060 Jan 14 06:24 redis-server.log.10.gz
-rw-rw----  1 redis redis  263921 Jan  6 06:24 redis-server.log.11.gz
-rw-rw----  1 redis redis  351542 Dec 31 06:24 redis-server.log.12.gz
-rw-rw----  1 redis redis  262797 Mar 10 06:24 redis-server.log.1.gz
-rw-rw----  1 redis redis  350019 Mar  4 06:24 redis-server.log.2.gz
-rw-rw----  1 redis redis  265519 Feb 24 06:24 redis-server.log.3.gz
-rw-rw----  1 redis redis  354475 Feb 18 06:24 redis-server.log.4.gz
-rw-rw----  1 redis redis   64552 Feb 10 06:24 redis-server.log.5.gz
-rw-rw----  1 redis redis  200074 Feb  8 19:29 redis-server.log.6.gz
-rw-rw----  1 redis redis  351475 Feb  4 06:24 redis-server.log.7.gz
-rw-rw----  1 redis redis  307853 Jan 27 06:24 redis-server.log.8.gz
-rw-rw----  1 redis redis  264428 Jan 20 06:24 redis-server.log.9.gz
pb66@test2:~$

Ah ok, I missread that then! you are right, my mistake :slight_smile:

I guess that does still leave us with the service restarts not being needed on stretch but needed on jessie

I don’t know what you’re looking at but using the correct services, they most certainly are recorded in the journal:

pi@emonpi:~ $ journalctl -u apache2.service 
-- Logs begin at Thu 2016-11-03 17:16:42 UTC, end at Sat 2019-03-16 12:19:02 UTC
Mar 16 09:23:30 emonpi systemd[1]: Starting The Apache HTTP Server...
Mar 16 09:23:31 emonpi apachectl[1042]: AH00558: apache2: Could not reliably det
Mar 16 09:23:31 emonpi systemd[1]: Started The Apache HTTP Server.
pi@emonpi:~ $ journalctl -u redis-server.service 
-- Logs begin at Thu 2016-11-03 17:16:42 UTC, end at Sat 2019-03-16 12:19:02 UTC
Mar 16 09:23:18 emonpi systemd[1]: Starting Advanced key-value store...
Mar 16 09:23:19 emonpi run-parts[500]: run-parts: executing /etc/redis/redis-ser
Mar 16 09:23:19 emonpi run-parts[623]: run-parts: executing /etc/redis/redis-ser
Mar 16 09:23:19 emonpi systemd[1]: Started Advanced key-value store.
Mar 16 09:23:27 emonpi systemd[1]: Stopping Advanced key-value store...
Mar 16 09:23:28 emonpi run-parts[917]: run-parts: executing /etc/redis/redis-ser
Mar 16 09:23:28 emonpi run-parts[923]: run-parts: executing /etc/redis/redis-ser
Mar 16 09:23:28 emonpi systemd[1]: Stopped Advanced key-value store.
Mar 16 09:23:28 emonpi systemd[1]: Starting Advanced key-value store...
Mar 16 09:23:29 emonpi run-parts[930]: run-parts: executing /etc/redis/redis-ser
Mar 16 09:23:29 emonpi run-parts[938]: run-parts: executing /etc/redis/redis-ser
Mar 16 09:23:29 emonpi systemd[1]: Started Advanced key-value store.
pi@emonpi:~ $ journalctl -u mariadb.service 
-- Logs begin at Thu 2016-11-03 17:16:42 UTC, end at Sat 2019-03-16 12:20:01 UTC
Mar 16 09:23:18 emonpi systemd[1]: Starting MariaDB database server...
Mar 16 09:23:19 emonpi systemd[1]: Stopped MariaDB database server.
Mar 16 09:23:19 emonpi systemd[1]: Starting MariaDB database server...
Mar 16 09:23:22 emonpi mysqld[826]: 2019-03-16  9:23:22 1988657152 [Note] /usr/s
Mar 16 09:23:27 emonpi systemd[1]: Started MariaDB database server.
pi@emonpi:~ $ 

Note that bash-autocompletion fills in the service names given the first few letters. systemctl will tell you that mariadb is the name of the service for mysql if you don’t remember, which I agree is a bit of a faff.

FWIW, I’m talking only about current systems. From what you say, I’m supposing that you maintain emoncms on jessie for as long as jessie is supported? But there’s no reason to support it in the same way, nor to halt development of the current system because of a decision to support previous systems. If I choose to run an older version of a system, I expect it to keep running in pretty much that way until I upgrade to a new release. Indeed, that’s exactly why I use openSUSE Leap rather than Tumbleweed (roughly equivalent to Debian stable versus testing, I suppose).

Ok great, it sounds like maintaining multiple rc.local version would make sense so that we can move forward with better practice for suitable distributions.

I’m not sure what the distinction is, both are systemd and the current Stretch based image is using the same “jessieminimal” rc.local, it’s just the repo file name, it is always named rc.local when in use. The distinction will most likely be pre-this point in development and post-this point regardless of Distro (assuming wheezy is already not supported since that wasn’t systemd and has already been left behind)

Hopefully we will reach a point where the rc.local file, close to it’s current form is only needed for installs “pre-this point” and we can discard, empty or move to a distro stock rc.local, it hopefully it won’t matter which as it will not be used, so when systemd eventually no longer supports rc.local or Distros start omitting it, the emonSD will not be affected.

Whether you update/upgrade previous images to whatever come out of this discussion or not, the distinction, dependency on rc.local and rc.local content will revolve specifically around whether the install/image is post rc.local era or not. Once we know what the outcome is, it can be decided whether it can be rolled out to previous images. Whilst i would love to see an all new image, Glyn has made the commitment some time ago that the current image (at that time) would always be upgrade moving forward. So either that comitment gets broken, we upgrade existing images or we could end up maintaining multiple images.

Sounds good. We could detect the required condition for the older rc.local and make that available where applicable. The update process could install the correct rc.local based on conditions.

It should be possible to avoid multiple images this way? correct me if im wrong.

I’ve just been advised about a journald feature I wasn’t aware of, which has changed how I suggest we proceed slightly.

SyncIntervalSec=
The timeout before synchronizing journal files to disk. After syncing, journal files are placed in the OFFLINE state. Note that syncing is unconditionally done immediately after a log message of priority CRIT, ALERT or EMERG has been logged. This setting hence applies only to messages of the levels ERR, WARNING, NOTICE, INFO, DEBUG. The default timeout is 5 minutes. 

If I’m reading that correctly, it means that journald will work exactly as we would like, straight out of the box. That is, if /var/log/jounal exists, or we set its storage mode to persistent. Then journald will save everything to the disk/sdcard and it will only write to the device every five minutes, except in an emergency.

So all we then need to do is override the native logs of the various applications to somewhere like /dev/null

I’ll do some experiments.

Ok, so what you are displaying there is the log messages created by systemd at start up, they are always in systemctl and the very latest have always been displayed with systemctl status.

What you do not see in your output is the logging created by the softwares themselves, a very different thing, that is going to /var/log for the services you displayed. This is not duplicate logging it is different logging in different places.

This discussion came up years ago about mosquitto and the mqtt_input services and it was decided to keep the 2 seperate as they are different things. Take emonhub for example, that produces a vast amount of helpful messages, therefore if this is going to journalctl rather than /var/log, the systemctl log messages are buried, you will need to use some filtering and regexes perhaps top find the systemd logs within the pages and pages of emonhub log messages. That’s if it’s still there, systemd will by default only keep upto 26mb of total logs, so vital boot up info will be quickly lost if every service is logging verbosely to that 26mb limit, yes it can be changed, but there is no need when the logs are kept separate as you demonstrate with your output, just the start and stops of the service(s) are displayed and retained as the faster moving log info is going to /var/log.

The start and stops are not displayed in my output above purely because the system has been up so long the logs have been rotated out (I only enabled persisting journalctl recently on this test server)

pb66@test2:~$ systemctl status redis
â redis-server.service - Advanced key-value store
   Loaded: loaded (/lib/systemd/system/redis-server.service; enabled; vendor pre
set: enabled)
   Active: active (running) since Tue 2019-02-26 09:25:09 GMT; 2 week
s 4 days ago
     Docs: http://redis.io/documentation,
           man:redis-server(1)
 Main PID: 971 (redis-server)
    Tasks: 3 (limit: 4915)
   CGroup: /system.slice/redis-server.service
           ââ971 /usr/bin/redis-server 127.0.0.1:6379

Warning: Journal has been rotated since unit was started. Log output is incomple
te or unavailable.
pb66@test2:~$ sudo journalctl -u redis-server
-- No entries --
pb66@test2:~$ sudo ls -la /var/log/redis
total 6180
drwxr-x---  2 redis redis    4096 Mar 10 06:25 .
drwxr-xr-x 11 root  root     4096 Mar 16 06:25 ..
-rw-rw----  1 redis redis 2899641 Mar 16 15:06 redis-server.log
-rw-rw----  1 redis redis  351060 Jan 14 06:24 redis-server.log.10.gz
-rw-rw----  1 redis redis  263921 Jan  6 06:24 redis-server.log.11.gz
-rw-rw----  1 redis redis  351542 Dec 31 06:24 redis-server.log.12.gz
-rw-rw----  1 redis redis  262797 Mar 10 06:24 redis-server.log.1.gz
-rw-rw----  1 redis redis  350019 Mar  4 06:24 redis-server.log.2.gz
-rw-rw----  1 redis redis  265519 Feb 24 06:24 redis-server.log.3.gz
-rw-rw----  1 redis redis  354475 Feb 18 06:24 redis-server.log.4.gz
-rw-rw----  1 redis redis   64552 Feb 10 06:24 redis-server.log.5.gz
-rw-rw----  1 redis redis  200074 Feb  8 19:29 redis-server.log.6.gz
-rw-rw----  1 redis redis  351475 Feb  4 06:24 redis-server.log.7.gz
-rw-rw----  1 redis redis  307853 Jan 27 06:24 redis-server.log.8.gz
-rw-rw----  1 redis redis  264428 Jan 20 06:24 redis-server.log.9.gz
pb66@test2:~$

see the last line of the status

Warning: Journal has been rotated since unit was started. Log output is incomple
te or unavailable.

I think you should just drop Jessie support. It is ancient history.

I kind of sympathise with that view, in fact it was my first thought too. But then I noticed (I was actually looking up which was newer - jessie or stretch - I’m not a Debian expert, remember) that jessie is an LTS release that has support until 2020 so there is an argument for maintaining software that runs on it for that long too. But I don’t think there’s any need to enhance it. If somebody wants new features they should upgrade to stretch.

I suppose that Trystan et al have a better understanding of how many people are dependent on jessie for some reason.

I think that might be a moot point in this scope, if we are supporting Stretch as it currently stands, you automatically get support for Jessie, unless you opt for some very latest feature that isn’t in the underlying version of systemd used in Jessie. The question is whether we continue support for Jessie and/or Stretch with restarting in the rc.local as it’s already out there on the Stretch base emonSD .