IM35461
(Mike Nelson)
18 March 2019 19:30
30
I guess perhaps 20 days I started on a new microsd card with the latest release 9.8.8 beta and have never made any command line changes.
Guess, what… it stopped updating
I did not have much time and did not want to loose any more data (14hrs lost) so I just selected the reboot option.
I just took the screen dump which seem to report all the services were running…
Any ideas on a cause?
O dear! and I can see you are running all the latest services using systemd with automatic restart. Perhaps it is related to the issue seen by @borpin , his stopped logging yesterday I think, and may be related to logs filling up - it may also be unconnected. @glyn.hudson has pushed a fix today to fix the log issue. You could try pulling that in with an update and seeing if the problem happens again?
IM35461
(Mike Nelson)
18 March 2019 21:25
32
I have applied the update and will see how it goes…
Many thanks.
1 Like
IM35461
(Mike Nelson)
4 April 2019 19:59
33
Just went to check how my one was going and it no longer showed up on the local network. It was working this morning.
The units LCD was displaying WiFi signal level but the button did nothing so I assume a total crash (never had one before).
Unplugged the USB briefly and it is going again…
IM35461
(Mike Nelson)
22 April 2019 09:03
34
Failed again after 18 days
I have left it not logging should anyone want me to check something.
borpin
(Brian Orpin)
22 April 2019 09:17
35
@IM35461 , Can you do a
ls -lah /var/log
and post the output please?
Any rotated files (with a .X where X is a number) can be deleted.
If you use this command -
sudo truncate -s 0 /var/log/[the log file name]
on the largest files listed, you should find the data inputs restart.
I have outlined a process here to change the way the logging happens and then how to manage the log files. If you want to have a go at this the please start a new thread. It would be useful as it would add evidence to my theory.
IM35461
(Mike Nelson)
22 April 2019 09:21
36
borpin:
ls -lah /var/log
Not sure I understand the second bit but the output from the first command is
emonSD-30Oct18
pi@emonpi:~ $ ls -lah /var/log
total 50M
drwxr-xr-x 11 root root 520 Apr 22 06:25 .
drwxr-xr-x 12 root root 4.0K Aug 2 2018 ..
drwxr-xr-x 2 root adm 100 Apr 4 19:21 apache2
-rw-r----- 1 root adm 40K Apr 22 09:17 auth.log
-rw-r--r-- 1 root root 6.3K Apr 4 19:21 boot.log
-rw------- 1 root utmp 0 Apr 22 09:14 btmp
-rw-r----- 1 root adm 25M Apr 22 09:18 daemon.log
-rw-r----- 1 root adm 1.2K Apr 4 19:21 debug
-rw-rw-rw- 1 root root 400 Apr 21 08:43 emoncms.log
drwxr-xr-x 2 pi root 60 Apr 4 19:20 emonpilcd
-rw-r----- 1 root adm 28K Apr 22 05:08 kern.log
drwxr-xr-x 2 pi pi 80 Apr 22 06:25 logrotate
-rw-r----- 1 root adm 24K Apr 22 09:18 messages
drwxr-xr-x 2 mosquitto mosquitto 60 Apr 4 19:21 mosquitto
-rw-rw-rw- 1 root root 0 Apr 4 19:21 mqtt_input.log
drwxr-xr-x 2 mysql adm 60 Apr 4 19:21 mysql
-rw-rw-rw- 1 root root 0 Apr 4 19:21 mysql.log
-rw-rw-rw- 1 root root 0 Apr 4 19:21 ntp_update.log
drwxr-xr-x 2 root root 40 Apr 4 19:21 openhab
drwxr-xr-x 2 root root 40 Apr 4 19:21 openhab2
drwxr-xr-x 2 redis redis 60 Apr 4 19:21 redis
-rw-rw-rw- 1 root root 0 Apr 4 19:21 service-runner.log
drwxr-xr-x 2 root root 60 Apr 4 19:21 supervisor
-rw-r----- 1 root adm 25M Apr 22 09:18 syslog
-rw-r----- 1 root adm 190 Apr 4 19:21 user.log
-rw-rw-r-- 1 root utmp 1.9K Apr 22 09:17 wtmp
pi@emonpi:~ $
IM35461
(Mike Nelson)
22 April 2019 09:29
37
I entered also these two command on what seem to be the big files
pi@emonpi:~ $ sudo truncate -s 0 /var/log/syslog
pi@emonpi:~ $ sudo truncate -s 0 /var/log/daemon.log
Nothing seems to change…
EDIT
Swap file use seems to have been reduced
borpin
(Brian Orpin)
22 April 2019 09:51
38
IM35461:
Nothing seems to change…
I should have said to do
df -h
It would have said /var/log/
was 100%. It should now say much less (in terms of %).
I’d like to see if the data logging can be restored without rebooting.
Could you post a new output of ls -lah /var/log/
please?
You showed the Feeds page before, are there entries in the Inputs page?
Is there anything in the log window of the Administration page?
[edit]All the data may be currently sitting in REDIS. What does this look like for you?
IM35461
(Mike Nelson)
22 April 2019 10:08
39
borpin:
ls -lah /var/log/
The first commands give
emonSD-30Oct18
pi@emonpi:~ $ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/root 3.9G 1.6G 2.2G 42% /
devtmpfs 484M 0 484M 0% /dev
tmpfs 489M 0 489M 0% /dev/shm
tmpfs 489M 50M 440M 11% /run
tmpfs 5.0M 4.0K 5.0M 1% /run/lock
tmpfs 489M 0 489M 0% /sys/fs/cgroup
tmpfs 30M 80K 30M 1% /tmp
tmpfs 50M 2.3M 48M 5% /var/log
tmpfs 1.0M 0 1.0M 0% /var/tmp
/dev/mmcblk0p1 43M 22M 21M 52% /boot
/dev/mmcblk0p3 3.5G 136M 3.2G 5% /home/pi/data
tmpfs 98M 0 98M 0% /run/user/1000
pi@emonpi:~ $
pi@emonpi:~ $ ls -lah /var/log/
total 2.3M
drwxr-xr-x 11 root root 520 Apr 22 06:25 .
drwxr-xr-x 12 root root 4.0K Aug 2 2018 ..
drwxr-xr-x 2 root adm 100 Apr 4 19:21 apache2
-rw-r----- 1 root adm 43K Apr 22 10:00 auth.log
-rw-r--r-- 1 root root 6.3K Apr 4 19:21 boot.log
-rw------- 1 root utmp 0 Apr 22 09:14 btmp
-rw-r----- 1 root adm 1.1M Apr 22 10:00 daemon.log
-rw-r----- 1 root adm 1.2K Apr 4 19:21 debug
-rw-rw-rw- 1 root root 400 Apr 21 08:43 emoncms.log
drwxr-xr-x 2 pi root 60 Apr 4 19:20 emonpilcd
-rw-r----- 1 root adm 28K Apr 22 05:08 kern.log
drwxr-xr-x 2 pi pi 80 Apr 22 06:25 logrotate
-rw-r----- 1 root adm 26K Apr 22 09:27 messages
drwxr-xr-x 2 mosquitto mosquitto 60 Apr 4 19:21 mosquitto
-rw-rw-rw- 1 root root 0 Apr 4 19:21 mqtt_input.log
drwxr-xr-x 2 mysql adm 60 Apr 4 19:21 mysql
-rw-rw-rw- 1 root root 0 Apr 4 19:21 mysql.log
-rw-rw-rw- 1 root root 0 Apr 4 19:21 ntp_update.log
drwxr-xr-x 2 root root 40 Apr 4 19:21 openhab
drwxr-xr-x 2 root root 40 Apr 4 19:21 openhab2
drwxr-xr-x 2 redis redis 60 Apr 4 19:21 redis
-rw-rw-rw- 1 root root 0 Apr 4 19:21 service-runner.log
drwxr-xr-x 2 root root 60 Apr 4 19:21 supervisor
-rw-r----- 1 root adm 1.2M Apr 22 10:00 syslog
-rw-r----- 1 root adm 190 Apr 4 19:21 user.log
-rw-rw-r-- 1 root utmp 2.7K Apr 22 09:59 wtmp
pi@emonpi:~ $
Adminstration Log
LAST ENTRIES ON THE LOG FILE
2019-04-04 19:21:14.505|ERROR|feedwriter.php|Starting feedwriter script
2019-04-04 19:21:17.502|ERROR|feedwriter.php|Starting feedwriter script
2019-04-04 19:50:24.877|WARN|emoncms_mqtt.php|Not connected, retrying connection
2019-04-04 19:50:24.907|WARN|emoncms_mqtt.php|Connecting to MQTT server: Connection Accepted.: code: 0
2019-04-21 08:43:36.807|ERROR|feedwriter.php|Starting feedwriter script
Inputs
I have no extra nodes and on my old image it never created a phantom nodes.
borpin
(Brian Orpin)
22 April 2019 10:14
40
Can you do a screen shot of the services status please?
borpin
(Brian Orpin)
22 April 2019 10:39
42
It seems all the services at least think they are running. The daemon.log
is increasing which implies that emonhub is running.
If you cat /var/log/daemon.log
you should see log entries from emonhub.py. As long as they are not all ERROR entries you are OK.
If you refresh the admin page, does the ‘feedwriter’ entry ever have any entries pending write?
Seems the processing might have stopped.
So try these in order and after each one, see if the data input is restored (wait a couple of minutes)
sudo systemctl restart mosquitto.service
sudo systemctl restart emoncms_mqtt.service
sudo systemctl restart feedwriter.service
sudo systemctl restart apache2.service
pb66
(Paul)
22 April 2019 10:42
43
As a sidenote
The swap file should be no more than an emergency escape lane. It should never be used except in extreme circumstance, the fact the ram usage is just 16% raises concern over how this is being managed. The “swappiness” needs turning down significantly to avoid the swap file being used just to relax the strain on the ram because the swap file is on disk so it adds to the disk writes and sdcard wear.
[edit - Just to note this is a development observation, swappiness is not a user level setting, it is a system setting]
1 Like
IM35461
(Mike Nelson)
22 April 2019 10:52
44
A clip from the log file
Apr 22 10:49:27 emonpi emonhub.py[990]: 2019-04-22 10:49:27,429 DEBUG MQTT Publishing: emon/emonpi/t4 0
Apr 22 10:49:27 emonpi emonhub.py[990]: 2019-04-22 10:49:27,431 DEBUG MQTT Publishing: emon/emonpi/t5 0
Apr 22 10:49:27 emonpi emonhub.py[990]: 2019-04-22 10:49:27,433 DEBUG MQTT Publishing: emon/emonpi/t6 0
Apr 22 10:49:27 emonpi emonhub.py[990]: 2019-04-22 10:49:27,435 DEBUG MQTT Publishing: emon/emonpi/pulsecount 1783
Apr 22 10:49:27 emonpi emonhub.py[990]: 2019-04-22 10:49:27,437 INFO MQTT Publishing: emonhub/rx/5/values 236,0,236,232.54,0,0,0,0,0,0,1783
Apr 22 10:49:29 emonpi emonhub.py[990]: 2019-04-22 10:49:29,102 DEBUG RFM2Pi Discarding RX frame ‘unreliable content’? 16 32 96 16 0 0 96 38 40 2 0 0 32 32 32 32 32 0 0 0 0 (-102)
pi@emonpi:~ $
IM35461
(Mike Nelson)
22 April 2019 11:01
45
This restart fixed it and almost instantly
sudo systemctl restart emoncms_mqtt.service
LAST ENTRIES ON THE LOG FILE
2019-04-04 19:21:14.505|ERROR|feedwriter.php|Starting feedwriter script
2019-04-04 19:21:17.502|ERROR|feedwriter.php|Starting feedwriter script
2019-04-04 19:50:24.877|WARN|emoncms_mqtt.php|Not connected, retrying connection
2019-04-04 19:50:24.907|WARN|emoncms_mqtt.php|Connecting to MQTT server: Connection Accepted.: code: 0
2019-04-21 08:43:36.807|ERROR|feedwriter.php|Starting feedwriter script
2019-04-22 10:59:35.740|WARN|emoncms_mqtt.php|Not connected, retrying connection
2019-04-22 10:59:35.770|WARN|emoncms_mqtt.php|Connecting to MQTT server: Connection Accepted.: code: 0
Back in action for another 18 days?
Greebo
(Greebo)
22 April 2019 11:11
46
/var/log
is 50M
based on the df -h
And there’s two files that were taking up 25M each, so /var/log would definitely have been full.
If you keep an eye on those files and run
pi@emonpi:~ $ sudo truncate -s 0 /var/log/syslog
pi@emonpi:~ $ sudo truncate -s 0 /var/log/daemon.log
before they get to 25M, you should be able to prevent it from stopping.
Maybe set yourself a reminder to do it in 14 days…
1 Like
borpin
(Brian Orpin)
22 April 2019 14:39
47
Great. I am 99.9% sure the issue is with the logs filling up.
To help with the time to fill up,
sudo nano /etc/rsyslog.conf
look for the line like this and make sure the entry daemon.none is there - if not add it.
*.*;auth,authpriv.none,daemon.none -/var/log/syslog
then
sudo systemctl restart rsyslog.service
You should now see just the daemon.log
file increasing in size and syslog
rising far more slowly. This should double the time between failures.
@TrystanLea @glyn.hudson - this needs looking at again. I have a suggested solution. A combination of emonSD next steps: emonhub logging (message 1) to create the log file for emonhub and emonSD next steps: emonhub logging - #13 by borpin (message 13) to rotate it. I am 2 days in of a test of this setup . I’d deploy 2 additional files (1 x monit, 1 x rsyslog) for the emoncms log as well.
IM35461
(Mike Nelson)
22 April 2019 14:40
48
Just for info, I see those two log files are up to 10M in size in just a few hours
That seems an awful lot of logging
emonSD-30Oct18
pi@emonpi:~ $ ls -lah /var/log
total 21M
drwxr-xr-x 11 root root 520 Apr 22 06:25 .
drwxr-xr-x 12 root root 4.0K Aug 2 2018 ..
drwxr-xr-x 2 root adm 100 Apr 4 19:21 apache2
-rw-r----- 1 root adm 58K Apr 22 14:37 auth.log
-rw-r--r-- 1 root root 6.3K Apr 4 19:21 boot.log
-rw------- 1 root utmp 0 Apr 22 09:14 btmp
-rw-r----- 1 root adm 10M Apr 22 14:37 daemon.log
-rw-r----- 1 root adm 1.2K Apr 4 19:21 debug
-rw-rw-rw- 1 root root 584 Apr 22 10:59 emoncms.log
drwxr-xr-x 2 pi root 60 Apr 4 19:20 emonpilcd
-rw-r----- 1 root adm 29K Apr 22 10:49 kern.log
drwxr-xr-x 2 pi pi 80 Apr 22 06:25 logrotate
-rw-r----- 1 root adm 26K Apr 22 10:49 messages
drwxr-xr-x 2 mosquitto mosquitto 60 Apr 4 19:21 mosquitto
-rw-rw-rw- 1 root root 0 Apr 4 19:21 mqtt_input.log
drwxr-xr-x 2 mysql adm 60 Apr 4 19:21 mysql
-rw-rw-rw- 1 root root 0 Apr 4 19:21 mysql.log
-rw-rw-rw- 1 root root 0 Apr 4 19:21 ntp_update.log
drwxr-xr-x 2 root root 40 Apr 4 19:21 openhab
drwxr-xr-x 2 root root 40 Apr 4 19:21 openhab2
drwxr-xr-x 2 redis redis 60 Apr 4 19:21 redis
-rw-rw-rw- 1 root root 0 Apr 4 19:21 service-runner.log
drwxr-xr-x 2 root root 60 Apr 4 19:21 supervisor
-rw-r----- 1 root adm 11M Apr 22 14:37 syslog
-rw-r----- 1 root adm 190 Apr 4 19:21 user.log
-rw-rw-r-- 1 root utmp 4.2K Apr 22 14:37 wtmp
pi@emonpi:~ $
borpin
(Brian Orpin)
22 April 2019 14:42
49
Which I think rsyslog has been using as /var/log/
has been full.