Emonhub out of control

My emonpi has been very reliable until I upgraded to the latest 01Feb24 image. I’ve reinstalled it several times now on different microSD cards, but after a random period it all goes horribly wrong.

I think the root cause is related to Emonhub. When all is working well, the log shows a reading every 10 seconds or so, see “emonhub.good.log” in zip file attached. Then at some point (no pattern identified), it starts updating every 0.3 seconds, as per “emonhub.bad.log” in zip file attached. This in turn affects the inputs and feeds. First of all, new inputs are generated, even though autoconf is turned off, see screenshots in zip file, and the inputs which flow into my feeds are wrong. For example, it’s now pitch-black outside and apparently I’m still generating 142W of solar power, see screenshot in zip file.

Unfortunately, restarting the Emonhub service or even a full reboot of the EmonPi itself does not make a difference, i.e. it keeps logging at the ~0.3 second interval.

When I stop the emonhub service and go to Serial Monitor and connect, all appears normal, i.e. the readings come through every~10 seconds, see screenshot attached. If I then stop the Serial Monitor and restart the Emonhub service, all works fine again!

In a previous post I asked for troubleshooting tips to try and gather useful information when these sort of issues arise, but unfortunately no feedback was received. Perhaps I’m the only one having this issue, but if somebody can help me troubleshoot that would be much appreciated. Failing that, I think my only option is to go back to 20Nov23 image, and I hope downgrading is supported in terms of migrating the data.

Thanks in advance for any help.
emonpi.zip (280.9 KB)

Pieter’s previous appeal for help:

I can’t remember seeing anything resembling this before. It looks very much like something for @TrystanLea to answer.

Looking closely at the energy values, I wonder whether your thinking

is right, because the message count, which originates in the analogue front end, is incrementing by one most times (it went up by 6 in 250 ms in one place) and the energy is incrementing slowly as I would expect if the front end had been commanded to report at ridiculously short intervals. It looks to me as if a wrong setting has got sent from somewhere, but I can’t explain how connecting the Serial Monitor rectifies the problem. As far as I know, only Trystan knows exactly what is sent when the serial monitor is started.

Thanks Robert,

Please ignore the symptoms I described in my previous post, as I’ve reinstalled the 01Feb24 image a few times since then, removed all the unused nodes, and I’ve updated the emonhub config to be as close to the default config file which comes with the latest image, but specifically disabling autoconf, renaming node 5 to be the same as before and disabling the SPI interfacer (just in case). Also, DemandShaper is no longer installed.

Checking the emonpi interface this morning, nothing is being logged by Emonhub (black screen) even though it’s set to debug, and when I download the emonhub.log file it is empty. However, the Inputs and Feeds are updating as normal.

One thing to add is that I’ve got a separate Raspberry Pi with Home Assistant software reading from the emonpi through the Emoncms integration, and I’m writing to the emonpi wit the Emoncms History integration (solar power forecast value only), both every 10 seconds. I wouldn’t expect this to have anything to do with the symptoms I’m experiencing, but thought I’d mention it just in case.

I do NOT understand that, from what I can see, if it is loglevel = DEBUG in the file itself, then it writes to the log. If it is anything other than this at the bottom of the screen

image
then it only sends that level of error to the screen, and the log carries on according to the file setting.

The only thing I can suggest is carefully check emonhub.conf for the spurious character that’s causing it problems.

This is definitely a question for someone who knows emonHub & emonCMS better than I do.

Thanks Robert. Yes, it’s a strange one :wink:. Logging level is set to debug and I just checked again, and it is logging now, with the log file dating back to just over an hour ago. For the avoidance of doubt, I didn’t change anything…

Attached is my emonhub.conf file (zipped). I don’t think there is anything wrong with it, but let me know if you spot anything…
emonhub.zip (1.5 KB)

It got away.
Drag and drop the file into the message.

Apologies Robert, message edited and file added.

No, I’m afraid nothing jumps out at me.

Thanks for checking Robert.

Given the inconsistent logging, I’m thinking it might be an idea to enable the syslog option to a separate syslog server, so we can see when logging starts and stops, and hopefully with some additional clues. However, I can’t find any instructions on how to configure the syslog option. I’ll need to uncomment the “use_syslog = yes” line, but how do I tell it where to send the data to?

I can’t answer this, I’m not Linux/Pi expert, sorry.

Well, it’s happened again. I now have a new node 1 where data is being received instead of node 5 (see screenshots). Nothing is being logged in Emonhub; not on the screen and emonhub.log is empty despite log level set to debug.

Restarted the Emonhub service and logging came back, but still to node 1, see emonhub.bad.log file.

Stopped Emonub service and started Serial Monitor. Data comes through for node 5, see screenshot.

Stopped Serial Monitor and started Emonhub, now correctly logging to node 5 again, see emonhub.good.log file.

At least I now know how to quickly recover from the issue, but it would obviously be better if we understood why this is happening so it can be fixed :wink:. Happy to test, troubleshoot, etc…
6may.zip (73.9 KB)

What physical sort of an emonPi is it (not that this should make any difference, but what is happening is so strange that I won’t assume anything).

We really need to get @Trystan or @glyn.hudson involved.

Hi Robert,

It’s the ‘standard’ emonPi as purchased in August 2019 (Solar PV bundle).

In this case, I’m wondering whether the increase in baud rate from 38400 to 115200 leads to an increase in transmission errors, and ultimately the symptoms I’m experiencing. From the emonPi github page, I see the 115200 baud rate was introduced as part of the 1.1.4 firmware version. I know that I initially used an earlier version of the continuous monitoring firmware (I think it was 1.1.1) and had no issues, so is there a way I can downgrade to 1.1.3 and reduce the baud rate to see if that makes a difference?

What else do you have in your system working to or with the emonPi? I couldn’t see below Node 5 in your emonhub.conf - not that this means a lot.

Hi Robert,

Only this:

and this script which runs every five minutes to upload data to pvoutput.org.

For the avoidance of doubt, the emonhub.conf file only contains Node 5.

ahhhh, yes… I had similar symptoms, drove me crazy, including the runaway emonhub and spurious inputs…

Hopefully this is the fix:

Wow, @alandpearson, that’s a great bit of detective work and it makes perfect sense! I’m glad it wasn’t just me experiencing those symptoms. I can confirm I also had the “console=serial0,115200” entry in the /boot/firmware/cmdline.txt file. I have now deleted that part, saved the file and rebooted my emonPi. If it keeps working fine over the next few days I’ll mark your feedback as the solution. I think your post should then also be pinned, as this is issue is likely to affect others as well. Thanks again :smile: :+1:

1 Like

:+1: to Alan. I wouldn’t have known where to look for the cause of the problem, but take some comfort from saying that it had to be because the front end was being reset, even though I didn’t know the mechanism that made it possible.

1 Like

This is another issue with emonsd feb2024 I haven’t got to the bottom of, but seems related to log2ram. It appears emonhub log files get moved regularly from the ram filesystem to /var/log.old but then logging stops altogether. I first thought it was logrotated causing the issue, but it is not.

Great that you are looking into that issue as well. The strange thing is that it comes and goes, i.e. after not logging for a while it starts again by itself. It’s not too big a deal for me now the “real issue” has been fixed (most likely!), but it would be nice to get it sorted of course :wink:. Keep us posted please, thanks :+1: