A new user here and I seem to have an issue with the data transferring between the emonTH and emonPI in that it seems to work from anything from 3 to 6 hours and then stops. I had been pulling the batteries on the emonTH initially and then restarting the emonPI but worked out the issue was with the emonPI so moved to power cycling that when I saw this issue. I have since worked out if I restart Emonhub the data transfer starts again.
I’ve not found anything searching for this issue and at this point I’m not really sure what I need to be searching for.
A message that starts NEW FRAME : OK 19 says that there is data from node 19 - that is usually the emonTH node. You will also see a OK 5 that is the emonPi energy monitoring node.
Hi, yes I do see those records in the logs although hadn’t realised that the logs are cleared on a restart so I need to get the logs next time it goes wrong before I restart anything.
I don’t know if this gives anymore insight into the issue I am seeing or not but I have seen the feed from the emonTH stop twice now with this message displayed.
Did you confirm in emoncms that the emonTH stopped updating when that message came up?
The message indicates that the file you are “tailing” was truncated, truncating the log file is part of the new log rotation process and appears to be simply kicking you out of the tail, if you were to repeat the tail command you would continue to see the updates (if they are still happening). I think this is not linked to your initial issue as emonhub continues to function 100% when the logs are truncated (or at least it used to).
When it fails next, can you check the services on the admin page to see if the emonhub and MQTT services are running? Also can you confirm if any other data continues when the emonTH data stops?
Could this be related to the cause of the truncation?
With --follow (-f), tail defaults to following the file descriptor,
which means that even if a tail’ed file is renamed, tail will
continue to track its end. This default behavior is not desirable
when you really want to track the actual name of the file, not the
file descriptor (e.g., log rotation). Use --follow=name in that
case. That causes tail to track the named file in a way that
accommodates renaming, removal and creation.
I ran it again all day yesterday without a failure until after I shut down the computer I was monitoring it on.
From the previous failures I can confirm that I have seen that the emonTH feed stopped updating at the same time as the last message was recorded in the SSH session. What I cant confirm at the moment is when the truncate message appeared but as I said previously I have seen the emonTH inputs stop updating twice with this message being the last thing in the SSH session watching the log file. I have seen the truncate message at other times in the SSH sessions I have been running and it has continued happily. I have checked the admin page when this happens and everything looks to be running normally but do bare in mind I have only been using this for just over a week.
I also noticed the last time this happened I had to restart emonHub twice before the feed started again.
This is as I expected, the “truncate” error is a red herring regards the issue you are seeking to cure. It’s more an unhelpful annoyance than anything.
This would be normal as you are specifically filtering only the emonTH messages so when the data stops the messages stop (or vice versa?).
Again, to be expected. When the emonTH stops the tail session will continue to be live, albeit with no activity due to the lack of emonTH messages, until the log file is rotated again causing the truncate error.
Not really, the reason the logrotation uses copytruncate is to preserve BOTH the file descriptor and the file name so that the program (emonhub) can continue to log to the exact same file with no change, it’s a rather crude way of getting around handling the rotations in a more robust way.
Although I’m no expert, I’m guessing the tail get orphaned by the truncate because one moment it is tailing the END of a file that suddenly is no longer part of that file as the file is reduced to 0 bytes.
The biggest hurdle here is the size of the logs, originally 2x 5M, they were replaced by 2x 1M which was found time and time again to be way too small and now they are apparently 0.5M.
I think that was supposed to be “500 Kb” as the whole log partition is only 50M for all current logs.
I think the “truncate” message will just need to be “put up with” when using tail. When ever that message pops up just end the tail with ctrl-c, then use the up arrow and enter. That will repeat the last command and the tail will continue, independent of whether the emonTH has failed or not.
So after the truncate message, you go no further data from node 23? If so, very odd. However, you do not know whether the tail stopped before the truncate message. i.e. last message at 19:36, truncate message 30 minutes later.
It certainly tells me the RFM stops getting data from the TH which makes me think it is faulty. @Gwil in the shop may be able to help.
Not really I have explained above already that the truncate error happens independent of the actual error and therefore one must come first and it can be either. if the file truncates before the emonTH stops the tail output stops because of the truncate error or if the emonTH stops the tail session will continue until the file is truncated. It’s a red herring. Of course they can happen around the same time, but even if they don’t it could appear they had.
Whilst it is too early to rule out a rare faulty unit, it could also be the antenna touching the gpio pins, not seated correctly or even more likely a PSU isssue since the rfm will always be the weakest link when it comes to power stability.
The emonPi has a watchdog that reinitialises the rfm every so often, that isn’t in the rfm firmware, but there again the latest firmware has been pretty stable. Assuming this RFM69Pi has the latest FW and isn’t from the back of shelf.
If the RFM69Pi was physically faulty it is less likely to come to life after emonhub is restarted, that would suggest the device is being reset by emonhub restarting the serial connection. This is how the RFM issues used to manifest some years ago, there are many threads on it and at least one of them has a little test script that the user can execute instead of restarting emonhub to reset the RFM2Pi and if the data starts, we know it’s the age old issue and probably a PSU issue if running the latest FW.
When did that change? As I recall all the emonSD images have always rotated the emonhub logs way before the emonhub hardcoded rotation at 5M. Originally by logrotate only at 1M then at 3M by log2ram+logrotate more recently. I was surprised by your comment about only 0.5M but took it you knew something I didn’t.
The emonSD has a global rotation size 100k
however the logrotate config for emonhub has maxsixe 3M
and if no logrotation takes place then eventually emonhub will rotate itself at 5M
so if the rotation is happening at 0.5M I’m not sure where that is coming from.
Nope. I see these truncate errors regularly when tailing the emonhub log, and the tail just keeps going. It isn’t an error, it is an info message telling you the file has been truncated.
I’m certain the output of the tail has stopped before the truncate message (i.e. the receipt of data has stopped).
No idea when it changed, I just know it does. There was a long discussion because at one point the service file didn’t pass the log file parameter so the logs were not rotated. Just checked it is 5M.
Still a faulty unit IMHO as it is new.
@dc1970 - did you buy a DC 5V PSU with the emonPi?
[edit]
Just double checked the logrotate config and that has a maxsize of 3M. Actually, it should be larger than the hardcoded value else there is a risk of conflict. logrotate should be a backstop I’d suggest. It will depend on how much you are logging of course.