Community
OpenEnergyMonitor

Reboot Required Every 20 Days Or So To Restore Logging

(Bjorn Nuyttens) #14

I’ve added

  • status of all services
  • top printout

You’re right about OpenHAB. Retrieving the data from EmonCMS will be much more reliable. @pb66 Good idea! Will make the watchdog less nervous at night :slight_smile:

(Bjorn Nuyttens) #15

The script pulling data from the EmonCMS API is ready. It checks when the temperature, power1 and power2 feeds were last updated. If any of these feeds was last updated more than 5 minutes ago, it logs a bunch of data.

The waiting game has begun. I hope it crashes soon :wink:
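
A minimal sketch of the staleness check described above, not the actual script from this post. It assumes the EmonCMS `feed/list.json` endpoint returns a JSON list in which each feed has a `name` and a `time` (Unix timestamp of the last update); `stale_feeds` and `fetch_feeds` are hypothetical helper names, and the base URL and API key are placeholders.

```python
import json
import urllib.request

STALE_AFTER_S = 5 * 60  # threshold from the post: 5 minutes


def stale_feeds(feeds, watched, now, max_age_s=STALE_AFTER_S):
    """Return the watched feed names that are missing or not updated recently."""
    last_seen = {feed["name"]: feed.get("time") for feed in feeds}
    stale = []
    for name in watched:
        last = last_seen.get(name)
        # A feed that is absent or has no timestamp is treated as stale.
        if not last or now - float(last) > max_age_s:
            stale.append(name)
    return stale


def fetch_feeds(base_url, apikey):
    """Fetch the feed list. Assumption: feed/list.json returns a JSON list."""
    url = f"{base_url}/feed/list.json?apikey={apikey}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


# Example use (placeholders, needs a reachable EmonCMS instance):
#   import time
#   feeds = fetch_feeds("http://localhost/emoncms", "YOUR_READ_APIKEY")
#   print(stale_feeds(feeds, ["temperature", "power1", "power2"], time.time()))
```

Keeping the comparison in a pure function means the 5-minute rule can be checked without a running EmonCMS; what gets logged when a feed is stale is left out here.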

(Bjorn Nuyttens) #16

@IM35461 I updated to low-write 9.8.28 | 2018.01.27 about a week ago and so far so good. It looks like the update fixed whatever the root cause was. Based on that experience, I’d recommend you install this update to fix the issue you’re having.

(Mike Nelson) #17

Just to let you know, it had been working well until yesterday, when it failed after 70+ days.

I did a reboot and installed an update, and now have a new problem: the feeds web page just shows the loading cogwheel, although the other web pages and the Android app still work.
I tried a cold boot, but it is no better :(

Any ideas?

(Andrew Findlow) #18

I’ve updated as you suggested, and after 21 days the feeds stopped logging.
In the server information, next to writer (in red), it says “Daemon is not running, start it at ~/scripts/feedwriter” (see attached).
How do I resolve this issue?

(Paul) #19

Have you restarted the feedwriter?

sudo systemctl restart feedwriter

It won’t fix the root issue, but it will do one of two things. Either it will return a fault/error that might give you insight into why it stopped running, or it will start and write all that buffered data to disk before it is lost through a reboot or power outage. You apparently have “1012788 feed points pending write” held in RAM.

(Andrew Findlow) #20

No, I had not tried that. I’m afraid I’ve already rebooted the system, so I’ll try it next time it happens (I assume it will do this again in approximately 21 days).
Any ideas what the root problem could be that causes the feed data to remain in the buffer and not be written to disk?

(Brian Orpin) #21

You need to take a copy of the current logs before rebooting.

Also looking at the output of

sudo systemctl status feedwriter

should give you the last few log entries for the service and may well tell you why it failed. It will also tell you exactly when it failed.
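
A minimal sketch of taking that copy before a reboot, assuming you know where your log files live. `snapshot_logs` is a hypothetical helper, and the paths in the usage comment are placeholders apart from the emonhub log path that appears in the status output in this thread.

```python
import shutil
import time
from pathlib import Path


def snapshot_logs(log_paths, dest_root, stamp=None):
    """Copy each existing log file into dest_root/<stamp>/ and return the copies."""
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    dest = Path(dest_root) / stamp
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in map(Path, log_paths):
        if src.is_file():
            # copy2 keeps the modification time, useful when comparing logs later.
            copied.append(Path(shutil.copy2(src, dest / src.name)))
    return copied


# Example use (destination is a placeholder; pick somewhere that survives a reboot):
#   snapshot_logs(["/var/log/emonhub/emonhub.log"], "/home/pi/log-snapshots")
```

Files that do not exist are skipped silently, and two logs with the same file name would overwrite each other in the snapshot folder, so adjust if that applies to your setup.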

(Andrew Findlow) #22

Paul and Brian,
Thanks for the help. I’m relatively new to monitoring at this level. Once I’ve got the logs (BTW, exactly which log do I need?) and the output of the command, where do I begin sifting through the data? Is there somewhere I can get information on how to interpret the logs? I’ve had a cursory look at the EmonHub log and don’t know where to begin (‘needle and haystack’ springs to mind).

(Brian Orpin) #23

If you try the status command now, you should see a ‘normal’ output. It is just a few lines of the log in reality.

A couple of examples (I do not use feedwriter). This first one is a service that I run and know is not working. You can see it says active (exited). Looking at the log entries, it looks like it lost the network (socket error) in this case…

# systemctl status helios.service
● helios.service - LSB: Start/stop heliosd
   Loaded: loaded (/etc/init.d/helios)
   Active: active (exited) since Wed 2018-09-19 12:38:31 BST; 4 days ago
  Process: 583 ExecStart=/etc/init.d/helios start (code=exited, status=0/SUCCESS)

Sep 19 12:38:56 OrPi1 helios[583]: multiple([msg], hostname, port, client_id, keepalive, will, auth, tl...port)
Sep 19 12:38:56 OrPi1 helios[583]: File "/usr/local/lib/python2.7/dist-packages/paho/mqtt/publish.py", ...tiple
Sep 19 12:38:56 OrPi1 helios[583]: client.connect(hostname, port, keepalive)
Sep 19 12:38:56 OrPi1 helios[583]: File "/usr/local/lib/python2.7/dist-packages/paho/mqtt/client.py", l...nnect
Sep 19 12:38:56 OrPi1 helios[583]: return self.reconnect()
Sep 19 12:38:56 OrPi1 helios[583]: File "/usr/local/lib/python2.7/dist-packages/paho/mqtt/client.py", l...nnect
Sep 19 12:38:56 OrPi1 helios[583]: sock = socket.create_connection((self._host, self._port), source_add..., 0))
Sep 19 12:38:56 OrPi1 helios[583]: File "/usr/lib/python2.7/socket.py", line 571, in create_connection
Sep 19 12:38:56 OrPi1 helios[583]: raise err
Sep 19 12:38:56 OrPi1 helios[583]: socket.error: [Errno 113] No route to host
Hint: Some lines were ellipsized, use -l to show in full.

This is my emonhub, which looks happy: active (running). Note you can see where the actual log file is located.

# systemctl status emonhub.service
● emonhub.service - LSB: Start/stop emonHub
   Loaded: loaded (/etc/init.d/emonhub)
   Active: active (running) since Wed 2018-09-19 12:38:46 BST; 4 days ago
  Process: 1058 ExecStart=/etc/init.d/emonhub start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/emonhub.service
           └─1069 python /usr/share/emonhub/emonhub.py --logfile /var/log/emonhub/emonhub.log --config-file ...

Sep 19 12:38:46 OrPi1 emonhub[1058]: Starting OpenEnergyMonitor emonHub: emonhub has been started ok.
Sep 19 12:38:46 OrPi1 systemd[1]: Started LSB: Start/stop emonHub.
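
The difference between the two outputs above is on the `Active:` line. A minimal sketch of pulling that line apart in a script, assuming the layout shown in these outputs; `parse_active` and `looks_healthy` are hypothetical helper names.

```python
import re

# Matches e.g. "Active: active (exited) since Wed 2018-09-19 12:38:31 BST; 4 days ago"
ACTIVE_RE = re.compile(
    r"^\s*Active:\s+(\S+)\s+\(([^)]+)\)(?:\s+since\s+(.+?);)?",
    re.MULTILINE,
)


def parse_active(status_text):
    """Return (state, sub_state, since) from `systemctl status` text, or None."""
    match = ACTIVE_RE.search(status_text)
    return match.groups() if match else None


def looks_healthy(status_text):
    """True only for 'active (running)'.

    For a long-running daemon, 'active (exited)' suggests the start script
    succeeded but no process is left running.
    """
    parsed = parse_active(status_text)
    return parsed is not None and parsed[:2] == ("active", "running")
```

Checking the part in brackets matters because both outputs start with "active"; only the sub-state tells a running daemon from one that has exited.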

Actually, in your case, the error you are seeing has nothing to do with emonhub. The two logs you need to look at are the feedwriter log and the main emoncms log. If the feedwriter log tells you when the service stopped, you can check that timestamp against the main log and see if there are any corresponding entries.
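
A minimal sketch of that timestamp check: given the time the service stopped, pick out the lines in another log that fall within a few minutes of it. It assumes each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp, which may not match your log format; `lines_near` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed prefix of each log line


def lines_near(log_lines, stopped_at, window_minutes=5):
    """Yield log lines whose leading timestamp is within the window of stopped_at."""
    window = timedelta(minutes=window_minutes)
    for line in log_lines:
        try:
            stamp = datetime.strptime(line[:19], STAMP_FORMAT)
        except ValueError:
            continue  # continuation lines, or lines in another format
        if abs(stamp - stopped_at) <= window:
            yield line.rstrip("\n")


# Example use (path is a placeholder for wherever your main log is):
#   with open("/path/to/emoncms.log") as log:
#       for line in lines_near(log, datetime(2018, 9, 22, 6, 19, 16)):
#           print(line)
```

Lines without a parseable timestamp are skipped rather than guessed at, so a multi-line error will only show its first line.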

(Andrew Findlow) #24

Thanks for that. I’ll try these suggestions when it next fails and report back.

(Andrew Findlow) #25

Sorry, I forgot to upload the output from the status command:

● feedwriter.service - LSB: feedwriter script daemon
   Loaded: loaded (/etc/init.d/feedwriter)
   Active: active (running) since Sat 2018-09-22 06:19:16 UTC; 1 weeks 0 days ago
   CGroup: /system.slice/feedwriter.service
           └─1833 /usr/bin/php -f /var/www/emoncms/scripts/feedwriter.php

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

(Andrew Findlow) #26

It’s been running for 31 days so far and the error has not occurred again and I have not needed to reboot etc (fingers crossed).
Thanks for your assistance and hopefully it does not reoccur anytime in the future.

(Andrew Findlow) #27

How do I turn on the feedwriter log? When I restart the feedwriter and then check the status, I’m informed that:

Log is turned off

$ sudo systemctl status feedwriter
● feedwriter.service - LSB: feedwriter script daemon
   Loaded: loaded (/etc/init.d/feedwriter)
   Active: active (exited) since Thu 2018-12-20 08:27:37 UTC; 1 months 14 days ago
  Process: 8484 ExecStop=/etc/init.d/feedwriter stop (code=exited, status=0/SUCCESS)
  Process: 15768 ExecStart=/etc/init.d/feedwriter start (code=exited, status=0/SUCCESS)

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
$ sudo systemctl restart feedwriter
$ sudo systemctl status feedwriter
● feedwriter.service - LSB: feedwriter script daemon
   Loaded: loaded (/etc/init.d/feedwriter)
   Active: active (running) since Sun 2019-02-03 07:53:41 UTC; 31s ago
  Process: 7078 ExecStop=/etc/init.d/feedwriter stop (code=exited, status=0/SUCCESS)
  Process: 7087 ExecStart=/etc/init.d/feedwriter start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/feedwriter.service
           └─7117 /usr/bin/php -f /var/www/emoncms/scripts/feedwriter.php

Feb 03 07:53:33 emonpi feedwriter[7087]: Log is turned off
Feb 03 07:53:33 emonpi feedwriter[7087]: Starting RPI
Feb 03 07:53:41 emonpi systemd[1]: Started LSB: feedwriter script daemon.

(Brian Orpin) #28

Hi @andrewhf01

It is most odd that Feedwriter keeps stopping for you.

I’d suggest a reboot may solve it in the short term.

However, there is a new feedwriter service about to be rolled out with the next stable build so I suggest that we revisit it once that has happened - shouldn’t be too long should it @TrystanLea?

(Andrew Findlow) #29

I’ve tried rebooting several times, but each time I reboot I lose all the data from the period during which the feedwriter was failing. If I restart the feedwriter instead, as suggested further up this thread, these data are not lost (which is some consolation).
Ok, I’ll wait until the new feedwriter is available.
Thanks,

(Mike Nelson) #30

I guess perhaps 20 days ago I started on a new microSD card with the latest release, 9.8.8 beta, and I have never made any command line changes.

Guess what… it stopped updating :frowning:

I did not have much time and did not want to lose any more data (14hrs lost), so I just selected the reboot option.

I just took a screen dump, which seems to report that all the services were running…

Any ideas on a cause?

(Trystan Lea) #31

Oh dear! And I can see you are running all the latest services using systemd with automatic restart. Perhaps it is related to the issue seen by @borpin; his stopped logging yesterday, I think, and may be related to logs filling up. It may also be unconnected. @glyn.hudson has pushed a fix for the log issue today. You could try pulling that in with an update and seeing if the problem happens again.

(Mike Nelson) #32

I have applied the update and will see how it goes…

Many thanks.

(Mike Nelson) #33

I just went to check how mine was going and it no longer showed up on the local network. It was working this morning.

The unit’s LCD was displaying the WiFi signal level but the button did nothing, so I assume a total crash (I have never had one before).

Unplugged the USB briefly and it is going again…
