Having to restart Emonhub to receive data from emonTH

dc1970 · 22 July 2020 07:18

Hi,

A new user here and I seem to have an issue with the data transferring between the emonTH and emonPI in that it seems to work from anything from 3 to 6 hours and then stops. I had been pulling the batteries on the emonTH initially and then restarting the emonPI but worked out the issue was with the emonPI so moved to power cycling that when I saw this issue. I have since worked out if I restart Emonhub the data transfer starts again.

I’ve not found anything searching for this issue and at this point I’m not really sure what I need to be searching for.

Any one got any ideas on how I can resolve this?

borpin · 22 July 2020 07:41

Hi welcome.

Can you open the admin page and click on Copy as Markdown next to Server Information and paste it here - no further formatting required.

dc1970 · 22 July 2020 07:50

Hi, requested information below.

Server Information

Services

emonhub :- Active Running
emoncms_mqtt :- Active Running
feedwriter :- Active Running - sleep 300s 43 feed points pending write
service-runner :- Active Running
emonPiLCD :- Active Running
redis-server :- Active Running
mosquitto :- Active Running
demandshaper :- Active Running

Emoncms

Version :- low-write 10.2.3
Modules :- Administration | App v2.1.6 | Backup v2.2.3 | EmonHub Config v2.0.5 | Dashboard v2.0.7 | DemandShaper v1.2.5 | Device v2.0.5 | EventProcesses | Feed | Graph v2.0.9 | Input | Postprocess v2.1.4 | CoreProcess | Schedule | Network Setup v1.0.0 | sync | Time | User | Visualisation | WiFi v2.0.2
Git :-
- URL :- GitHub - emoncms/emoncms: Web-app for processing, logging and visualising energy, temperature and other environmental data
- Branch :- * stable
- Describe :- 10.2.3

Server

OS :- Linux 4.19.75-v7+
Host :- emonpi | emonpi | (192.168.1.34)
Date :- 2020-07-22 08:48:17 BST
Uptime :- 08:48:17 up 12:41, 0 users, load average: 0.03, 0.09, 0.09

Memory

RAM :- Used: 19.47%
- Total :- 975.62 MB
- Used :- 189.98 MB
- Free :- 785.64 MB
Swap :- Used: 0.00%
- Total :- 100 MB
- Used :- 0 B
- Free :- 100 MB
  Write Load Period

Disk

/ :- Used: 51.11%
- Total :- 3.92 GB
- Used :- 2.01 GB
- Free :- 1.72 GB
- Write Load :- 806.52 B/s (12 hours 11 mins)
/boot :- Used: 20.55%
- Total :- 252.05 MB
- Used :- 51.79 MB
- Free :- 200.26 MB
- Write Load :- 0 B/s (12 hours 11 mins)
/var/opt/emoncms :- Used: 0.05%
- Total :- 9.98 GB
- Used :- 4.95 MB
- Free :- 9.47 GB
- Write Load :- 68.91 B/s (12 hours 11 mins)
/var/log :- Used: 3.91%
- Total :- 50 MB
- Used :- 1.95 MB
- Free :- 48.05 MB
- Write Load :- n/a

HTTP

Server :- Apache/2.4.38 (Raspbian) HTTP/1.1 CGI/1.1 80

MySQL

Version :- 5.5.5-10.3.17-MariaDB-0+deb10u1
Host :- localhost:6379 (127.0.0.1)
Date :- 2020-07-22 08:48:17 (UTC 01:00‌)
Stats :- Uptime: 46230 Threads: 9 Questions: 126359 Slow queries: 0 Opens: 51 Flush tables: 1 Open tables: 42 Queries per second avg: 2.733

Redis

Version :-
- Redis Server :- 5.0.3
- PHP Redis :- 5.0.2
Host :- localhost:6379
Size :- 129 keys (837.03K)
Uptime :- 0 days

MQTT Server

Version :- Mosquitto 1.5.7
Host :- localhost:1883 (127.0.0.1)

PHP

Version :- 7.3.9-1~deb10u1 (Zend Version 3.3.9)
Modules :- apache2handlercalendar Core ctype curl date dom v20031129exif fileinfo filter ftp gd gettext hash iconv json v1.7.0libxml mbstring mosquitto v0.4.0mysqli mysqlnd vmysqlnd 5.0.12-dev - 20150407 - $Id: 7cc7cc96e675f6d72e5cf0f267f48e167c2abb23 $openssl pcre PDO pdo_mysql Phar posix readline redis v5.0.2Reflection session shmop SimpleXML sockets sodium SPL standard sysvmsg sysvsem sysvshm tokenizer wddx xml xmlreader xmlwriter xsl Zend OPcache zlib

Pi

Model :- Raspberry Pi 3 Model B Rev 1.2 - 1GB (Embest)
Serial num. :- 539EE22D
CPU Temperature :- 44.55°C
GPU Temperature :- 44.0°C
emonpiRelease :- emonSD-17Oct19
File-system :- read-write

Client Information

HTTP

Browser :- Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36
Language :- en-US,en;q=0.9

Window

Size :- 1903 x 967

Screen

Resolution :- 1920 x 1200

borpin · 22 July 2020 09:00

Thanks,

On the emonhub page, Do you see DEBUG messages?

If not, please click on the Edit Config button and in the window that appears change the log level to DEBUG.

Click Save - no need to restart emonhub, go back to the log window and you should see DEBUG messages.

You are looking for messages that say NEW FRAME e.g.

2020-07-22 09:54:42,634 DEBUG    RFM2Pi     68399 NEW FRAME : OK 19 221 0 0 0 195 1 26 0 (-66)
2020-07-22 09:54:42,636 DEBUG    RFM2Pi     68399 Timestamp : 1595408082.6345186
2020-07-22 09:54:42,637 DEBUG    RFM2Pi     68399 From Node : 19
2020-07-22 09:54:42,637 DEBUG    RFM2Pi     68399    Values : [22.1, 0, 45.1, 2.6]
2020-07-22 09:54:42,638 DEBUG    RFM2Pi     68399      RSSI : -66

A message that starts NEW FRAME : OK 19 says that there is data from node 19 - that is usually the emonTH node. You will also see a OK 5 that is the emonPi energy monitoring node.

Can you post the equivalent to above please.

dc1970 · 22 July 2020 09:14

Hi, yes I do see those records in the logs although hadn’t realised that the logs are cleared on a restart so I need to get the logs next time it goes wrong before I restart anything.

2020-07-22 08:20:39,505 DEBUG    RFM2Pi     445 NEW FRAME : OK 23 214 0 0 0 10 2 30 0 1 0 0 0 (-41)
2020-07-22 08:20:39,507 DEBUG    RFM2Pi     445 Timestamp : 1595402439.5056157
2020-07-22 08:20:39,514 DEBUG    RFM2Pi     445 From Node : 23
2020-07-22 08:20:39,520 DEBUG    RFM2Pi     445    Values : [21.400000000000002, 0, 52.2, 3, 1]
2020-07-22 08:20:39,525 DEBUG    RFM2Pi     445      RSSI : -41
2020-07-22 08:20:39,531 DEBUG    RFM2Pi     445 Sent to channel(start)' : ToEmonCMS
2020-07-22 08:20:39,537 DEBUG    RFM2Pi     445 Sent to channel(end)' : ToEmonCMS
2020-07-22 08:20:39,549 DEBUG    MQTT       Publishing: emon/emonth5/temperature 21.400000000000002
2020-07-22 08:20:39,551 DEBUG    MQTT       Publishing: emon/emonth5/external temperature 0
2020-07-22 08:20:39,552 DEBUG    MQTT       Publishing: emon/emonth5/humidity 52.2
2020-07-22 08:20:39,553 DEBUG    MQTT       Publishing: emon/emonth5/battery 3
2020-07-22 08:20:39,555 DEBUG    MQTT       Publishing: emon/emonth5/pulsecount 1
2020-07-22 08:20:39,556 DEBUG    MQTT       Publishing: emon/emonth5/rssi -41
2020-07-22 08:20:39,558 INFO     MQTT       Publishing: emonhub/rx/23/values 21.400000000000002,0,52.2,3,1,-41

borpin · 22 July 2020 09:17

@dc1970,

When posting code or terminal output, please put 3 backticks (top left of keyboard) on a line before and on a line after, so it formats correctly.

The emonhub logs rotate at 500 Mb so if it continues to receive data from elsewhere, the logs will rotate.

Do you have more than one emonTH? If so, is it just that one that fails?

dc1970 · 22 July 2020 09:21

No this is my only one, only received the emonPI and emonTH on Friday and first spotted the issue on Saturday.

borpin · 22 July 2020 09:55

Have you enabled SSH & are you comfortable with using it?

dc1970 · 22 July 2020 10:22

No I haven’t enabled it but have used in previously with other pi’s.

borpin · 22 July 2020 10:26

Ok good. I suggest 2 sessions

tail -f /var/log/emonhub/emonhub.log | grep "NEW FRAME : OK 23"

And you will see when data is received (or not)

journalctl -f -u emonhub

should pick up any exceptions in the python.

dc1970 · 22 July 2020 10:40

OK, thanks for your help, I will give it a try later.

dc1970 · 24 July 2020 19:44

Hi,

I don’t know if this gives anymore insight into the issue I am seeing or not but I have seen the feed from the emonTH stop twice now with this message displayed.

2020-07-24 19:33:29,527 DEBUG    RFM2Pi     8157 NEW FRAME : OK 23 28 1 0 0 135 1 31 0 1 0 0 0 (-73)
2020-07-24 19:34:28,819 DEBUG    RFM2Pi     8170 NEW FRAME : OK 23 28 1 0 0 136 1 31 0 1 0 0 0 (-74)
2020-07-24 19:35:28,019 DEBUG    RFM2Pi     8183 NEW FRAME : OK 23 28 1 0 0 136 1 31 0 1 0 0 0 (-75)
2020-07-24 19:36:27,383 DEBUG    RFM2Pi     8196 NEW FRAME : OK 23 28 1 0 0 136 1 31 0 1 0 0 0 (-74)
tail: /var/log/emonhub/emonhub.log: file truncated

pb66 · 25 July 2020 20:39

Did you confirm in emoncms that the emonTH stopped updating when that message came up?

The message indicates that the file you are “tailing” was truncated, truncating the log file is part of the new log rotation process and appears to be simply kicking you out of the tail, if you were to repeat the tail command you would continue to see the updates (if they are still happening). I think this is not linked to your initial issue as emonhub continues to function 100% when the logs are truncated (or at least it used to).

When it fails next, can you check the services on the admin page to see if the emonhub and MQTT services are running? Also can you confirm if any other data continues when the emonTH data stops?

Bill.Thomson · 26 July 2020 01:13

Could this be related to the cause of the truncation?

With --follow (-f), tail defaults to following the file descriptor,
which means that even if a tail’ed file is renamed, tail will
continue to track its end. This default behavior is not desirable
when you really want to track the actual name of the file, not the
file descriptor (e.g., log rotation). Use --follow=name in that
case. That causes tail to track the named file in a way that
accommodates renaming, removal and creation.

Ref: tail(1) - Linux manual page

dc1970 · 26 July 2020 07:54

I ran it again all day yesterday without a failure until after I shut down the computer I was monitoring it on.

From the previous failures I can confirm that I have seen that the emonTH feed stopped updating at the same time as the last message was recorded in the SSH session. What I cant confirm at the moment is when the truncate message appeared but as I said previously I have seen the emonTH inputs stop updating twice with this message being the last thing in the SSH session watching the log file. I have seen the truncate message at other times in the SSH sessions I have been running and it has continued happily. I have checked the admin page when this happens and everything looks to be running normally but do bare in mind I have only been using this for just over a week.

I also noticed the last time this happened I had to restart emonHub twice before the feed started again.

pb66 · 26 July 2020 09:17

This is as I expected, the “truncate” error is a red herring regards the issue you are seeking to cure. It’s more an unhelpful annoyance than anything.

This would be normal as you are specifically filtering only the emonTH messages so when the data stops the messages stop (or vice versa?).

Again, to be expected. When the emonTH stops the tail session will continue to be live, albeit with no activity due to the lack of emonTH messages, until the log file is rotated again causing the truncate error.

Not really, the reason the logrotation uses copytruncate is to preserve BOTH the file descriptor and the file name so that the program (emonhub) can continue to log to the exact same file with no change, it’s a rather crude way of getting around handling the rotations in a more robust way.

Although I’m no expert, I’m guessing the tail get orphaned by the truncate because one moment it is tailing the END of a file that suddenly is no longer part of that file as the file is reduced to 0 bytes.

The biggest hurdle here is the size of the logs, originally 2x 5M, they were replaced by 2x 1M which was found time and time again to be way too small and now they are apparently 0.5M.

I think that was supposed to be “500 Kb” as the whole log partition is only 50M for all current logs.

I think the “truncate” message will just need to be “put up with” when using tail. When ever that message pops up just end the tail with ctrl-c, then use the up arrow and enter. That will repeat the last command and the tail will continue, independent of whether the emonTH has failed or not.

borpin · 26 July 2020 12:13

Yes sorry (edited).

So after the truncate message, you go no further data from node 23? If so, very odd. However, you do not know whether the tail stopped before the truncate message. i.e. last message at 19:36, truncate message 30 minutes later.

It certainly tells me the RFM stops getting data from the TH which makes me think it is faulty. @Gwil in the shop may be able to help.

borpin · 26 July 2020 12:14

emonhub logs are rotated in code by emonhub, not by logrotate.

pb66 · 26 July 2020 14:02

Not really I have explained above already that the truncate error happens independent of the actual error and therefore one must come first and it can be either. if the file truncates before the emonTH stops the tail output stops because of the truncate error or if the emonTH stops the tail session will continue until the file is truncated. It’s a red herring. Of course they can happen around the same time, but even if they don’t it could appear they had.

Whilst it is too early to rule out a rare faulty unit, it could also be the antenna touching the gpio pins, not seated correctly or even more likely a PSU isssue since the rfm will always be the weakest link when it comes to power stability.

The emonPi has a watchdog that reinitialises the rfm every so often, that isn’t in the rfm firmware, but there again the latest firmware has been pretty stable. Assuming this RFM69Pi has the latest FW and isn’t from the back of shelf.

If the RFM69Pi was physically faulty it is less likely to come to life after emonhub is restarted, that would suggest the device is being reset by emonhub restarting the serial connection. This is how the RFM issues used to manifest some years ago, there are many threads on it and at least one of them has a little test script that the user can execute instead of restarting emonhub to reset the RFM2Pi and if the data starts, we know it’s the age old issue and probably a PSU issue if running the latest FW.

When did that change? As I recall all the emonSD images have always rotated the emonhub logs way before the emonhub hardcoded rotation at 5M. Originally by logrotate only at 1M then at 3M by log2ram+logrotate more recently. I was surprised by your comment about only 0.5M but took it you knew something I didn’t.

The emonSD has a global rotation size 100k

however the logrotate config for emonhub has maxsixe 3M

and if no logrotation takes place then eventually emonhub will rotate itself at 5M

github.com

openenergymonitor/emonhub/blob/master/src/emonhub.py#L289-L290


      
          # Command line arguments parser
          parser = argparse.ArgumentParser(description='OpenEnergyMonitor emonHub')

so if the rotation is happening at 0.5M I’m not sure where that is coming from.

borpin · 26 July 2020 14:31

Nope. I see these truncate errors regularly when tailing the emonhub log, and the tail just keeps going. It isn’t an error, it is an info message telling you the file has been truncated.

I’m certain the output of the tail has stopped before the truncate message (i.e. the receipt of data has stopped).

No idea when it changed, I just know it does. There was a long discussion because at one point the service file didn’t pass the log file parameter so the logs were not rotated. Just checked it is 5M.

github.com

openenergymonitor/emonhub/blob/6f81ae0b1b355b3d4644a33905379b934923c41d/src/emonhub.py#L289


      
              sys.exit()
          
          # Logging configuration
          logger = logging.getLogger("EmonHub")
          if args.logfile:
              # If path was given, rotating logging over two 5 MB files
              # If logfile is supplied, argparse opens the file in append mode,
              # this ensures it is writable
              # Close the file for now and get its path
              args.logfile.close()
              loghandler = logging.handlers.RotatingFileHandler(args.logfile.name,
                                                                'a', 5000 * 1024, 1)
          else:
              # Otherwise, if no path was specified, everything goes to sys.stderr
              loghandler = logging.StreamHandler()
          # Format log strings
          loghandler.setFormatter(logging.Formatter(
              '%(asctime)s %(levelname)-8s %(threadName)-10s %(message)s'))
          
          logger.addHandler(loghandler)

Still a faulty unit IMHO as it is new.

@dc1970 - did you buy a DC 5V PSU with the emonPi?

[edit]
Just double checked the logrotate config and that has a maxsize of 3M. Actually, it should be larger than the hardcoded value else there is a risk of conflict. logrotate should be a backstop I’d suggest. It will depend on how much you are logging of course.

github.com

openenergymonitor/EmonScripts/blob/b9bd920e87c659a9846acd438ff9a0face3add9e/defaults/etc/logrotate.d/emonhub#L1


      
          /var/log/emonhub/emonhub.log {
              maxsize 3M
          
              norenamecopy
              copytruncate
              su root root
              compress
              
              olddir /var/log.old/emonhub
              createolddir 775 root emonhub
          }

@TrystanLea comment?