Mqtt_input service high memory usage

[Discussion moved from Emonpi low-write 9.8.25 - inputs stopped working - #51]

I just noticed an issue on pukka’s install with the mqtt_input service.

The RAM usage is steadily increasing.

As of now, we are at 43.0% [418.4 MB] used by the mqtt_input service alone.

I checked, and the data is being written to PHPFINA, so I’m unsure why the service is still holding it in RAM.

At this rate we’re creeping toward a crash.

Any idea how to resolve this?

Not with just the information given, no.

I know very little about the mqtt_input service, but maybe try stopping each MQTT source in turn to see whether that stops the RAM usage increasing. Assuming it is data that’s being held, the usage might not shrink, but it might stop growing.

Interesting, I’ve not seen this before. Do you think it’s worth splitting this out into a separate thread, since this is a different issue from the original one in this thread? Done.

Could you post the output of sudo top to show the memory use of the mqtt_input service?
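If a screenshot of top is awkward to capture over several days, the same figure can be logged from /proc instead. The following is only a minimal sketch, assuming a Linux system such as the emonPi. The function names and the sample text are made up for illustration, and the PID of the service has to be looked up separately (for example from top).

```python
import re
from pathlib import Path
from typing import Optional


def parse_vmrss_kb(status_text: str) -> Optional[int]:
    """Return the resident set size in kB from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vmrss_kb(pid: int) -> Optional[int]:
    """Read the current resident memory of a running process (Linux only)."""
    return parse_vmrss_kb(Path(f"/proc/{pid}/status").read_text())


if __name__ == "__main__":
    # Made-up sample in the kernel's format, not taken from the system in this thread.
    sample = "Name:\tphp\nVmPeak:\t  500000 kB\nVmRSS:\t  428452 kB\n"
    print(parse_vmrss_kb(sample))
```

Calling read_vmrss_kb at a fixed interval and writing the result to a file would give a growth curve that can be compared with the rate of incoming MQTT messages.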

What happened in the last few days, did the memory use continue to increase? Did the system crash?

Is @Pukka running a stock emonSD setup? Have there been any changes made?

Which versions of Mosquitto, Redis, Emoncms and emonHub is @Pukka running?

It may be too early to know, but I wonder if this is related to a problem @stabuck is having with a new image? (see emonPi hangs and stops logging - #34 by stabuck) He reports that there are no inputs updating locally on his emonPi and that the sudo systemctl restart mqtt_input command “just hangs”.

For reference this is what top output from my emonPi looks like

As far as the system goes, it’s a “low-write 9.8.28”, fully updated as per the discussion in Emonpi low-write 9.8.25 - inputs stopped working.

The RAM usage has been increasing in line with the MQTT data feed.

So far no crash, but we’re at 75.1% RAM usage and stability will become an issue soon.

emoncms is pretty stock.

Versions:
mosquitto v0.3.0
redis v2.2.7
low-write 9.8.28 | 2018.01.27
EmonHub Config v1.0.0

But there is a fair bit going on on the inputs side

Can we assume the node-red modbus stuff is MQTT? What about the BMW and OpenEVSE? Is the weather MQTT too?

How often are all these updated? How many topics is each one? How much input processing is on each input? What QoS level does each one use?

All the screenshots tell us is that mqtt_input is using a lot of resources. The biggest part of debugging this is to work out what it is actually doing.

So, after researching the matter, the data being held in RAM is all of the data sent to MQTT from the node-red flows (weather, solax (modbus) and solax 2 (modbus)) and from the emonPi.

Those are the only 3 flows in node-red that send data out via MQTT.

emonTH5, BMWi3, openevse, IotaWatt, and EmonTX-O are not being saved to the RAM.

All feeds are PHPFINA, and all the data is being written correctly. It’s just the 4 (weather, solax, solax 2, and emonpi) that are being retained in RAM as well as in PHPFINA.

Input processing made no difference: some inputs have no processing, some are just log to feed, and some have several processes attached. All are showing up in the RAM.

For example, emonpi t2-6 are not used; however, ALL show up in RAM as “0”.

openevse and IotaWatt also send the data to MQTT, however, those are not being stored in RAM.

For troubleshooting purposes, the script associated with the RAM usage is /var/www/emoncms/scripts/phpmqtt_input.php, and the data in question is mapped as [anon] and [heap]. I had expected both, so it was no surprise when I checked.

So, does anyone have any idea why some data is being stored in RAM and other data is not?
Is it possible that this was broken by an update in which not all scripts were updated as they should have been, leaving a function update missing?

I’m hoping someone who was involved in writing this script can weigh in and narrow this down.

I still have not found where the issue is with @pukka’s install.

I have done a memory dump and came up with the following:
Does anyone have any idea what to make of this? It appears that some of each string is missing.
This is from the memory range mapped to this particular service.

Filtered data:
emon/Solax 2/GridFrequency/
emon/Solax/Powerdc2/ put:2
emon/Solax/PvCurrent2/ put:1
emon/Solax/PvVoltage2/ put:1
emon/Solax/Powerdc1/ input:1
emon/Solax/PvCurrent1/ 000 1
emon/Solax/PvVoltage1/ put:1
emon/Solax 2/PvVoltage1/ t:1
emon/Solax 2/PvCurrent1/ t:1
emon/Solax 2/Powerdc1/ put:1
emon/Solax 2/PvVoltage2/ 0 1
emon/Solax 2/PvCurrent2/ t:1
emon/Solax 2/Powerdc2/ put:1
emon/Solax 2/Temperature/ :1
emon/Solax 2/TodayKwh/ put:1
emon/Solax 2/GridPower/ 00 1
emon/Solax 2/BatVoltage/ 0 1
emon/Solax 2/OutputW/ ut:2
emon/Solax/BatVoltage/ 000
emon/Solax 2/BatTemp/ ut:2
emon/Solax 2/BatLevel/ put:1
emon/Solax 2/BatPower/ put:1
emon/Solax 2/BatCurrent/ t

Where are you reading the filtered data from? What should these look like?

@nchaveiro wrote the virtual feeds implementation in emoncms and may be able to help.

@TrystanLea

The filtered data is just a cleaned up version of what’s in the ram dump. I cleaned it up to make it more readable.

I don’t think it has any relation to virtual feeds, as the memory dumps point to the MQTT data being persisted.
Just a guess, but it may be related to the PHP Mosquitto MQTT driver having memory leaks or not performing garbage collection.
See Memory garbage collection · Issue #32 · mgdm/Mosquitto-PHP · GitHub

Maybe try another PHP MQTT client library.

I have asked this before, but I will ask again. What are the QoS settings for each of the MQTT sources?

The openEVSE project uses the MQTT lib at GitHub - knolleary/pubsubclient: A client library for the Arduino Ethernet Shield that provides support for MQTT. Its readme states “It can only publish QoS 0 messages. It can subscribe at QoS 0 or QoS 1.”, so it may be safe to assume that the data from openEVSE is QoS 0.

To my knowledge IoTaWatt doesn’t support MQTT. If it does, or if you have implemented something yourself, what are the QoS settings?

The emonpi variant of emonhub uses QoS 2. So all the emonhub sourced data will be QoS 2.

I have no idea what the nodered position is on MQTT QoS, I would expect it to be user defined, or dependent on the flow/node/module used.

QoS 0 is basically a fire-and-forget “broadcast” of the data: it is not stored in memory, and no one cares whether it is received once it is published. This might be why OpenEVSE doesn’t appear in the memory dump.

QoS 2 is “Exactly once”, which means that copies hang around until receipt is confirmed not just once but twice. It forces a two-way exchange of 4 messages between sender and receiver.

Although the QoS for any given message is set by the sender, when the receiver (emoncms) subscribes to MQTT it asks for a QoS level (emoncms appears to ask for QoS2), and the lower of the two will prevail.

So although emoncms asks for QoS2, when openEVSE publishes a QoS0 message, it arrives at emoncms as a QoS0. But when the emonpi variant of emonhub publishes a QoS2 message, the QoS2 level is maintained and the receiver (emoncms) will need to provide 2 confirmations for the message to be deleted from the sender (broker).
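The “lower QoS prevails” rule can be written down in a couple of lines. This is just an illustrative sketch of the rule as described above, with a made-up function name; it does not talk to a broker.

```python
def delivered_qos(publish_qos: int, subscribe_qos: int) -> int:
    """QoS at which the broker forwards a message to one subscriber."""
    for level in (publish_qos, subscribe_qos):
        if level not in (0, 1, 2):
            raise ValueError(f"invalid QoS level: {level}")
    # The subscriber never receives a message at a higher level than it asked for,
    # and the broker does not raise the level chosen by the publisher.
    return min(publish_qos, subscribe_qos)


if __name__ == "__main__":
    # emoncms is described above as subscribing at QoS 2.
    print(delivered_qos(0, 2))  # a QoS 0 publish, as assumed for openEVSE
    print(delivered_qos(2, 2))  # a QoS 2 publish, as from the emonpi variant of emonhub
```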

But I’m wondering if the QoS2 is causing an issue for emoncms if all 4 interactions are not completing.
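To make that worry concrete, here is a toy model of the receiving side of a QoS 2 exchange (PUBLISH, PUBREC, PUBREL, PUBCOMP). It is only a sketch of the protocol idea, not how Mosquitto-PHP or the broker are actually implemented, and the class and attribute names are invented. It shows that any exchange which never reaches the PUBREL step leaves state behind on the receiver.

```python
class Qos2Receiver:
    """Toy model of the receiving side of a QoS 2 exchange."""

    def __init__(self) -> None:
        self.pending = set()   # packet ids received but not yet released
        self.delivered = []    # payloads handed to the application

    def on_publish(self, packet_id: int, payload: str) -> str:
        # Message 1 (PUBLISH) arrives: deliver once, remember the id,
        # and answer with message 2 (PUBREC).
        if packet_id not in self.pending:
            self.pending.add(packet_id)
            self.delivered.append(payload)
        return "PUBREC"

    def on_pubrel(self, packet_id: int) -> str:
        # Message 3 (PUBREL) arrives: forget the id and answer with message 4 (PUBCOMP).
        self.pending.discard(packet_id)
        return "PUBCOMP"


if __name__ == "__main__":
    receiver = Qos2Receiver()
    for packet_id in range(1, 6):
        receiver.on_publish(packet_id, f"value {packet_id}")
        if packet_id % 2 == 0:
            # Pretend the PUBREL only ever arrives for even packet ids.
            receiver.on_pubrel(packet_id)
    print(sorted(receiver.pending))
```

In this model the ids whose exchange never completes stay in pending indefinitely. Whether anything like this is happening in phpmqtt_input.php is exactly what the QoS tests would need to establish.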

It would help a great deal to know what the QoS levels are for all the MQTT sources, to see if there is a pattern, e.g. all QoS0 ok and all QoS2 not ok. I’m not suggesting that any level of QoS is wrong or correct, just looking for a pattern to help debug.

Some tests publishing test data via the commandline with various QoS levels might prove or disprove the theory.
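As a rough sketch of such a test, the snippet below builds one mosquitto_pub command per QoS level using the standard -h, -t, -m and -q options. The helper name and the qostest topics are hypothetical, and the broker is assumed to be on localhost; if the broker requires authentication, -u and -P would need adding.

```python
import shlex


def mosquitto_pub_command(topic: str, payload: str, qos: int, host: str = "localhost") -> list:
    """Build the argument list for one mosquitto_pub test publish."""
    if qos not in (0, 1, 2):
        raise ValueError(f"invalid QoS level: {qos}")
    return ["mosquitto_pub", "-h", host, "-t", topic, "-m", payload, "-q", str(qos)]


if __name__ == "__main__":
    for qos in (0, 1, 2):
        # Hypothetical test topics, one per QoS level, so each is easy to spot in a memory dump.
        command = mosquitto_pub_command(f"emon/qostest{qos}/value", "1", qos)
        print(shlex.join(command))
```

The printed commands can be pasted into a shell, or each list can be passed to subprocess.run. If only the QoS 2 topic turns up in the memory dump, that would support the theory; if all three (or none) do, it would count against it.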

Maybe the fact openEVSE (QoS0) is ok and emonhub (QoS2) isn’t is just a coincidence, but since MQTT has a mechanism for retaining topics and not others depending on their QoS level, when asking “why do some topics get retained and not others” it seems a logical place to look, even without the coincidental cases.

For another test you could also try editing emoncms to be QoS0 for a while. What’s the worst that can happen? Some data gets missed?

Same for emonhub, try lowering the QoS for a test (not at the same time as emoncms).

(I have no idea why it’s been hardcoded in 3 different places (lines 122, 134 & 153) rather than using a variable or even a single publish function.)

Originally the QoS was 0 in both emoncms (emoncms/scripts/phpmqtt_input.php at 1f280ca54b9c94c935ade5a389c4db47a18a86bf · emoncms/emoncms · GitHub) and the emonpi variant of emonhub (emonhub/src/interfacers/EmonHubMqttInterfacer.py at c5590e00c440d518d55413a09336bdfd1e037ada · openenergymonitor/emonhub · GitHub).

@pb66

All of the NodeRed stuff that publishes to MQTT (Solax, Solax 2, and weather) is set to QoS “0” and Retain “False”.

Also, IoTaWatt is able to publish directly to EmonCMS. I am unsure as to its QoS.

This seems to be more of a bug than a setting, as the data that is appearing in the RAM is incomplete.

I haven’t suggested otherwise. I’m suggesting you explore the only useful info you have right now, which is that some nodes are affected and some are not; that doesn’t suggest randomness. If you were able to get the same node to be held in RAM or not, depending on a change of QoS setting, topic length, payload content, update frequency etc., you would have reduced the field considerably.

I’m suggesting you shake the tree and see what falls out. If you try nothing and nothing changes, you have learnt nothing and made no progress. If you do try something, even if it doesn’t change anything, you’ve learnt something and that’s progress, even if it is just ruling something out.

Have you checked the logs or set something up to confirm that everything is ok going into emoncms? Are there any missing datapoints in the feeds? Are the posts stuck in RAM processed or not?

I previously asked for data on all the nodes, update intervals, topics and payloads.

Have you tried subscribing to the emoncms topic from the command line (using QoS 2) to confirm there is nothing funky going on with the data going into emoncms?

Have you tried stopping any of the mqtt nodes to see if it reduces the issues?

Have you tried setting up a test node with a simple payload of an incrementing packet id or timestamp, so that you can see if data is getting lost along the route? Does that test node appear in the memory dump?
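Checking such a test node for lost data comes down to looking for gaps in the counter. A minimal sketch, assuming the received counter values have already been collected into a list (for example from the feed data or from a command-line subscriber); the function name and sample values are made up.

```python
def find_missing(received_ids):
    """Return the counter values absent between the lowest and highest id received."""
    seen = set(received_ids)
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


if __name__ == "__main__":
    # Made-up example: the publisher counted 1..9, but some values never arrived.
    received = [1, 2, 3, 5, 6, 9]
    print(find_missing(received))
```

An empty result means every value between the first and last one received made it through. It says nothing about values lost before the first or after the last, so the test should run for a while on either side of any change being tried.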

You suggest this fault is present all the time, starts at boot and grows. This should give you loads of opportunity as you can try something and know almost immediately if it’s made a difference, you are not at the mercy of the fault occurring periodically at an unknown interval.

I can find no mention of MQTT in the IoTaWatt repo, and there are several posts on the IoTaWatt forum in March confirming it wasn’t available at that time, so I think mentioning IoTaWatt here may be a red herring, as it is most likely posting via http and irrelevant to this discussion.

To my knowledge no one else is experiencing this, so whilst I agree this is a bug, I am also fairly convinced it shows up on this system because of the way it’s configured or because of the data that’s being passed. The largest part of diagnosing this issue is to identify why it is happening on this setup, and perhaps only to certain nodes. That can only be done by looking closely at the finer details of this setup and playing with the settings and/or data content to establish what is failing and why; then a fix can be applied so that the current setup does work. I’m in no way suggesting the issue is due to an incorrect user setting.

Just as an observation, it looks like every one of the lines in your “filtered data” above has a space or special character in it. What are we looking at there? Is it all topic+value, or are some of the payloads multi-value? Why are there so many colons? Is the data corrupt as well as truncated? What does a good topic+payload look like? I would expect to see braces if there are colons, but I can’t find any example code right now.

The bulk of the lines come from “Solax 2”, which contains a space. Whilst that is valid MQTT, it is bad practice, as spaces cause issues elsewhere. Perhaps the emoncms mqtt input is struggling with the spaces? Emoncms tends to struggle with them everywhere else.

Don’t use spaces in a topic
A space is the natural enemy of each programmer, they often make it much harder to read and debug topics, when things are not going the way, they should be. So similar to the first one, only because something is allowed doesn’t mean it should be used. UTF-8 knows many different white space types, it’s pretty obvious that such uncommon characters should be avoided.
(from MQTT Topics, Wildcards, & Best Practices – MQTT Essentials: Part 5)
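A quick way to follow that advice is to check topics for whitespace before publishing. The helpers below are a hypothetical sketch (the names and the underscore replacement are my own choice), using a topic of the same form as those in the memory dump above.

```python
import re


def has_whitespace(topic: str) -> bool:
    """True if any character in the topic is whitespace (including Unicode spaces)."""
    return any(ch.isspace() for ch in topic)


def sanitize_topic(topic: str, replacement: str = "_") -> str:
    """Trim each topic level and replace runs of whitespace inside it."""
    levels = topic.split("/")
    return "/".join(re.sub(r"\s+", replacement, level.strip()) for level in levels)


if __name__ == "__main__":
    topic = "emon/Solax 2/GridFrequency"
    print(has_whitespace(topic))
    print(sanitize_topic(topic))
```

Renaming “Solax 2” in the node-red flow for a while would show whether the space has anything to do with the data being held in RAM.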