Random Hangs of emonPi/emonSD

Hi

I’m looking for some guidance on an issue I am having with my emonPi/emonSD - I’m not sure where the problem actually is though!

Recently (maybe every month or so), my emonPi just hangs, stops capturing any feeds or shipping data to EmonCMS. I don’t always notice this for a day or two…invariably I find out it’s happened again when I see the EmonCMS app show me some nonsensical graph.

Once it gets into this state, I am always able to ping the device and access the emonSD admin screen etc. If I reboot the device, when it comes back there is still no monitoring of anything…and I actually have to shut down the device and disconnect the power to do a full cold reboot…at which point everything then comes back working normally and monitoring starts again.

I’m not sure where to start hunting down the issue here…any guidance or thoughts?

Thanks
Robin

Anything in the logs (emoncms/emonhub/syslog)?

Sadly, I don’t see anything of note in any of the log files. The last ‘hang’ took place sometime on the morning of July 2nd…and there’s nothing jumping out at me in any of the logs around that timeframe to say there’s been an issue. :face_with_diagonal_mouth:

Thanks
Robin

Are there still entries in the emonhub Log? Are there entries in the emoncms log? Are they both just empty?

Just because it isn’t jumping out at you, doesn’t mean there isn’t something of note…

Do you just use the EmonPi on its own? No other sensors feeding in?

Are all the services showing as running?

Hi Brian

Sorry for being so imprecise. There are a variety of logs in the /var/log.old directory, but in the emonhub directory, there are only more recent files. In the emoncms log directory there are loads of entries (see image) - but when uncompressed the files all seem empty.

The EmonPi is used with a current clamp and four wireless temperature sensors. When one of these ‘hang events’ occurs, I can open the emoncms admin page and everything seems to be running…yet no data is being collected from any sensors. Reboot and everything still seems to run, yet still no sensor data is collected. Power down and then start up again, everything comes back successfully and data is once again collected. Everything is running OK now…and I’ll probably have to just wait a few weeks until it hangs again to gather some more information.

Here is the server information

System Info
Server Information

Services

  • emonhub :- Active Running

  • emoncms_mqtt :- Active Running

  • feedwriter :- Active Running - sleep 300s 14 feed points pending write

  • service-runner :- Active Running

  • emonPiLCD :- Active Running

  • redis-server :- Active Running

  • mosquitto :- Active Running

  • demandshaper :- Active Running

Emoncms

Server

  • CPU :- 1 Threads(s) | 4 Core(s) | 1 Sockets(s) | Cortex-A53 | 59.73MIPS |
  • OS :- Linux 5.4.51-v7+
  • Host :- emonpi | emonpi | (192.168.2.10)
  • Date :- 2022-07-05 22:37:08 BST
  • Uptime :- 22:37:08 up 5:55, 1 user, load average: 0.11, 0.10, 0.11

Memory

  • RAM :- Used: 19.25%
    • Total :- 925.85 MB
    • Used :- 178.25 MB
    • Free :- 747.61 MB
  • Swap :- Used: 0.00%
    • Total :- 100 MB
    • Used :- 0 B
    • Free :- 100 MB

Disk

  • / :- Used: 49.75%
    • Total :- 4.06 GB
    • Used :- 2.02 GB
    • Free :- 1.85 GB
    • Read Load :- 69.17 B/s
    • Write Load :- 161.39 B/s
    • Load Time :- 5 hours 52 mins
  • /var/opt/emoncms :- Used: 0.49%
    • Total :- 24.86 GB
    • Used :- 125.18 MB
    • Free :- 23.47 GB
    • Read Load :- 63.98 B/s
    • Write Load :- 73.72 B/s
    • Load Time :- 5 hours 52 mins
  • /boot :- Used: 21.52%
    • Total :- 252.05 MB
    • Used :- 54.23 MB
    • Free :- 197.81 MB
    • Read Load :- 0 B/s
    • Write Load :- 0 B/s
    • Load Time :- 5 hours 52 mins
  • /var/log :- Used: 6.96%
    • Total :- 50 MB
    • Used :- 3.48 MB
    • Free :- 46.52 MB
    • Read Load :- n/a
    • Write Load :- n/a
    • Load Time :- n/a

HTTP

  • Server :- Apache/2.4.38 (Raspbian) HTTP/1.1 CGI/1.1 80

MySQL

  • Version :- 5.5.5-10.3.23-MariaDB-0+deb10u1
  • Host :- 127.0.0.1 (127.0.0.1)
  • Date :- 2022-07-05 22:37:08 (UTC 01:00)
  • Stats :- Uptime: 86005 Threads: 13 Questions: 949 Slow queries: 0 Opens: 46 Flush tables: 1 Open tables: 40 Queries per second avg: 0.011

Redis

  • Version :-
    • Redis Server :- 5.0.3
    • PHP Redis :- 5.3.1
  • Host :- localhost:6379
  • Size :- 492 keys (926.36K)
  • Uptime :- 0 days

MQTT Server

  • Version :- Mosquitto 1.5.7
  • Host :- localhost:1883 (127.0.0.1)

PHP

  • Version :- 7.3.19-1~deb10u1 (Zend Version 3.3.19)
  • Run user :- User: www-data Group: www-data video Script Owner: pi
  • Modules :- apache2handler calendar Core ctype curl date dom v20031129 exif fileinfo filter ftp gd gettext hash iconv json v1.7.0 libxml mbstring mosquitto v0.4.0 mysqli mysqlnd vmysqlnd 5.0.12-dev - 20150407 - $Id: 7cc7cc96e675f6d72e5cf0f267f48e167c2abb23 $ openssl pcre PDO pdo_mysql Phar posix readline redis v5.3.1 Reflection session shmop SimpleXML sockets sodium SPL standard sysvmsg sysvsem sysvshm tokenizer wddx xml xmlreader xmlwriter xsl Zend OPcache zlib

Pi

  • Model :- Raspberry Pi 3 Model B+ Rev 1.3 - 1GB (Sony UK)

  • Serial num. :- CEEE5A02

  • CPU Temperature :- 51.54°C

  • GPU Temperature :- 51.5°C

  • emonpiRelease :- emonSD-24Jul20

  • File-system :- read-write

Client Information

HTTP

  • Browser :- Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36
  • Language :- en-GB,en-US;q=0.9,en;q=0.8

Window

  • Size :- 1583 x 886

Screen

  • Resolution :- 1440 x 900

That sounds like the issue is with the front-end electronics, as rebooting does not reset the
328 microcontroller, but when the power is cycled, both the RPi and the 328 µC get reset.

@robert.wall, your thoughts?

That’s possible. First thoughts are a power supply issue. It has been known for a brown-out to lock up the RFM69CW, which will in turn hang everything else.

Unfortunately, the reset line on the RFM69CW is not used, so it’s not possible to reset it in software.

I’ve also had this problem where the data logging inexplicably stops and a power cycle is the only fix.
I have learned to live with it, since I was never able to figure out why it does it, and it only occurs two or three times each year.
However, something caught my eye a while back when I was looking at the EmonPi Schematic and I have wondered about it ever since.
I noticed that the ATMega328 microprocessor was running from a 3.3V supply, but has a 16MHz crystal. I’m sure I read somewhere that 16MHz was only supported when running on a 5V supply. If that’s true, then I wonder whether this is the cause of the spurious behaviour.
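
For anyone wanting to put a number on the 16MHz question, here is a rough Python sketch. It assumes the speed-grade figures I recall from the ATmega328P datasheet (4MHz at 1.8V, 10MHz at 2.7V, 20MHz at 4.5V, linear in between), so please check them against the datasheet for the exact part fitted. The function name is made up for illustration.

```python
def max_safe_clock_mhz(vcc: float) -> float:
    """Maximum specified clock (MHz) for a given supply voltage.

    Assumed breakpoints (from memory of the ATmega328P datasheet):
    4 MHz @ 1.8 V, 10 MHz @ 2.7 V, 20 MHz @ 4.5 V, linear between them.
    """
    if vcc < 1.8 or vcc > 5.5:
        raise ValueError("VCC outside the 1.8-5.5 V operating range")
    if vcc < 2.7:
        return 4.0 + (vcc - 1.8) / (2.7 - 1.8) * (10.0 - 4.0)
    if vcc < 4.5:
        return 10.0 + (vcc - 2.7) / (4.5 - 2.7) * (20.0 - 10.0)
    return 20.0


if __name__ == "__main__":
    limit = max_safe_clock_mhz(3.3)
    print(f"Max specified clock at 3.3 V: {limit:.1f} MHz")
    print("16 MHz within spec:", 16.0 <= limit)
```

Under those assumptions the limit at 3.3V comes out at about 13.3MHz, so 16MHz would be outside the specified range. That alone does not show it is what causes the hang.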

This has been discussed here before:

I think my emonPi has got into this state maybe 3 times this year so far. It seems unpredictable (as in I can’t correlate any other factors right now) but I’m going to keep a close eye on it from now on.

I could probably change the PSU for one of a number of 5V PSUs I have lying around, if that was a potential causal factor. I could even switch it to an open frame Bel Power linear 5V 3A PSU (that I had planned for my software defined radio gear) but would need to do some work to enclose it first.

It’s certainly a headscratcher for me

It’s the PSU. It’s always the wretched PSU.
Don’t change it for some random 5V PSU, buy a decent RPi PSU. They’re usually a little higher than 5V, to allow for a bit of voltage drop when things get busy.

As well as having ample current capacity, not to mention a much cleaner output than
the typical USB charger.

Trystan did some tests ages ago: Not all USB power supplies are created the same - Blog | OpenEnergyMonitor

Had something similar happen to me because the Pi was running out of disk space.
Something in the EmonCMS UI was triggering the web server to write a lot of errors to its log file, which would then automatically get cleaned up, destroying the evidence that the error logs had filled the disk.
Maybe check it with “df” and make sure you have plenty of space.
Otherwise you might need to change some Linux logging settings, which is what I did to get it to stop logging errors to disk.

Wish I could remember what it was that was causing all those errors. Maybe some kind of graph object or dashboard object that was referencing a feed that I had deleted or renamed or something like that. I remember having a lot of errors about something like that a while ago.
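
If you would rather have the “df” check run unattended (from cron, say), a minimal Python sketch could look like the one below. The mount points are the ones shown on the Admin page earlier in this thread; the helper names and the 90% warning threshold are just assumptions for illustration.

```python
import shutil


def percent_used(total: int, used: int) -> float:
    """Used space as a percentage of total (0.0 if total is zero)."""
    return 100.0 * used / total if total else 0.0


def check_mounts(mounts, warn_percent=90.0):
    """Return (mount, percent_used, is_over_threshold) for each readable mount."""
    results = []
    for mount in mounts:
        try:
            usage = shutil.disk_usage(mount)
        except OSError:
            # Mount point does not exist on this system; skip it.
            continue
        pct = percent_used(usage.total, usage.used)
        results.append((mount, pct, pct >= warn_percent))
    return results


if __name__ == "__main__":
    # Mount points taken from the Admin page output above.
    for mount, pct, warn in check_mounts(["/", "/var/log", "/var/opt/emoncms", "/boot"]):
        flag = "  <-- LOW SPACE" if warn else ""
        print(f"{mount}: {pct:.1f}% used{flag}")
```

The percentage may differ slightly from what df prints, because df works it out from used plus available space rather than from the total. Since /var/log is only 50 MB on this image, that is probably the one to watch.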

Mine is just plugged into a USB Socket as part of a 13A wall socket :laughing: It’s rock solid.

Just your luck I think.

This has been a problem in the past, but larger cards mean it is rarely an issue. You can check the available space from the Admin page (which is just a df call IIRC).

On logs, again, a load of work was done to make sure these rotate correctly and are saved from the RAMLOG to disk. However, that looks as if it might be broken now.

That is odd. @TrystanLea - this probably needs investigating as it should not be the case…

It looks like there is plenty of diskspace. The emonpi has been up for 8 days since the last hang…I’m keeping an eye on it to see when it happens next

I fully agree. My Red Pitayas (which I run as software defined radios) have similar problems with flaky 5V PSUs. When the FPGAs are running full tilt, any weak 5V PSU fails pretty quickly (i.e. within a few minutes). To reduce the SMPS-induced RFI for my radios, I am switching to Bel Power HB5-3/OVP-AG linear 5V/3A supplies for the Red Pitayas. That might be overkill for the emonpi though.

I will secure another good quality RPi PSU in case it’s the PSU. The emonpi has been running for 8 days since the last crash…and seems OK at the moment.

Any way to log low voltage events? My little USB UPS reduces PSU stresses:
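
On the logging question, one option on the Raspberry Pi side is the firmware's `vcgencmd get_throttled` command, which reports a bitmask where bit 0 means under-voltage right now and bit 16 means under-voltage has occurred since boot. Below is a minimal Python sketch that decodes it; the function name and flag wording are my own. Note the assumption that this only reflects the Pi's own supply monitor, so it would not necessarily catch a dip that upsets the RFM69CW or the 328.

```python
import subprocess
from typing import List

# Bit positions in the value reported by `vcgencmd get_throttled`.
FLAGS = {
    0: "under-voltage detected now",
    1: "ARM frequency capped now",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred since boot",
    17: "ARM frequency capping has occurred since boot",
    18: "throttling has occurred since boot",
    19: "soft temperature limit has occurred since boot",
}


def decode_throttled(output: str) -> List[str]:
    """Decode a line such as 'throttled=0x50005' into readable flags."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [text for bit, text in FLAGS.items() if value & (1 << bit)]


if __name__ == "__main__":
    try:
        result = subprocess.run(
            ["vcgencmd", "get_throttled"],
            capture_output=True, text=True, check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError) as err:
        print("Could not run vcgencmd:", err)
    else:
        flags = decode_throttled(result.stdout)
        print(flags if flags else "No under-voltage or throttling flags set")
```

Run from cron every few minutes with the output appended to a file, this would give a timestamped record to compare against the next hang.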

Have a look at this thread - it describes a similar situation.
I implemented the reset script and set it to trigger once a day from crontab. The problem seems to have gone.

I experience more or less the same freezes, which cause wrong kWh/d processing.
I do use a Raspberry Pi Zero 2 W with emonscripts to install emonCMS.

Today my emonCMS was very slow/laggy but I managed to log in and check the admin page.
After 4 days of uptime:

After a reboot:

Could it be that the full swap file is causing this? Or the almost maxed out 512MB RAM from the RPi Z2W?
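
To see whether RAM or swap really is creeping up between reboots, the figures can be logged from `/proc/meminfo` rather than read off the admin page by hand. Here is a minimal Python sketch, assuming a Linux system; the function names are made up for illustration, and the percentages may not match the Admin page exactly since it may calculate "used" differently.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: value_in_kB}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def usage_percent(info: dict):
    """Return (ram_used_percent, swap_used_percent)."""
    mem_total = info["MemTotal"]
    # MemAvailable is the kernel's estimate of reclaimable memory;
    # fall back to MemFree on very old kernels that lack it.
    mem_avail = info.get("MemAvailable", info["MemFree"])
    swap_total = info.get("SwapTotal", 0)
    swap_free = info.get("SwapFree", 0)
    ram = 100.0 * (mem_total - mem_avail) / mem_total
    swap = 100.0 * (swap_total - swap_free) / swap_total if swap_total else 0.0
    return ram, swap


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as handle:
            ram, swap = usage_percent(parse_meminfo(handle.read()))
        print(f"RAM used: {ram:.1f}%  Swap used: {swap:.1f}%")
    except OSError as err:
        print("Could not read /proc/meminfo:", err)
```

Logging this every few minutes should show whether usage climbs steadily over the 4 days (which would suggest a leak somewhere) or jumps suddenly.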