Phantom Duplicated MQTT Topics after reboot

I’m running a version of the EMONPI image created following the instructions from around February / March 2016. On the whole it works well, however occasionally after a reboot I get duplicate Inputs named the same as existing ones. Unfortunately the duplicates receive the incoming messages from my EMONTX’s with the originals topics which are assigned to feeds being stagnant. Deleting the new inputs reactivates the old ones.
It was discussed on the old forum in MQTT Duplicate Inputs created after several reboots | Archived Forum, was anything resolved?
I see the problem both with incoming inputs from EMONHUB and with Topics created by the publish to MQTT process which EMONHUB picks up and displays as new inputs.

I’m afraid this has not been resolved. I am aware of this issue but no quite sure what’s causing it!

How often has this happened for you?

Do the new inputs have identical names to the old ones?

Is it only on reboot?

May be related but in the opposite direction; I create a new MQTT topic in Nodered and while it turns up in the Inputs list on the emonpi it does not get put into the input table in MySQL, which I only found after trying to log to a new feed. A reboot fixed it without further action required.

It’s hard to debug something that happens so rarely. I’ve only seen this happen once in 100’s of reboots on many emonPi’s!

Just to summarise where we are at:

ISSUE

A new set of Emoncms Inputs for the same node are created by the Emoncms MQTT input script. These inputs have exactly the same names as the old ones. The MQTT input script obtains the names of the inputs from the MQTT topic names.

e.g The MQTT topic emon/emontx/power1 results in an Emoncms Input called power1 from emontx node. Any data published to MQTT base topic (default emon/) will appear in Emoncms inputs.

Here is a screen grab illustrating the issue, when this occurs the user needs to move their input processes to the newly created inputs to keep their feeds updating:

POSSIBLE CAUSES

  1. Publisher - emonHub publishing a slightly different mqtt topic name
  2. Broker - Mosquitto MQTT broker corrupting the topic name somehow
  3. Subscriber - Emoncms PHP input script which uses Mosquitto-PHP library corrupting/changing the topic name when subscribing

If I had to guess I would guess the that subscriber (3.) would be the culprit. The Mosquitto Debian repository is called libmosquitto-dev and the PHP Mosquitto library is called Mosquitto-alpha . Dev and alpha don’t inspire confidence!

See how MQTT input script and Mosquitto PHP is setup on the latest emonPi image here.


Debug

I suppose a way to try and get to the bottom of this would be to setup a log file to log all MQTT messages then when the issues occurs try and see if it holds any clues. If the MQTT messages all look correct then the problem must be with (3.) the subscribers. I will setup a test…

Update:

If anyone is interested I have used nodeRED that’s built into the latest emonPi image to log all MQTT messages on the emon/ topic to a log /mnt/1TB/log/emonpimqtt.log file together with the topic name, payload and time + date. I have mounted an external hard-drive to on the emonPi to preserve space on the emonPi and avoid hammering the SD card. This log is going to get BIG!

Here is the nodeRED flow:

[{"id":"286b0130.d794fe","type":"mqtt-broker","z":"","broker":"localhost","port":"1883","clientid":"","usetls":false,"verifyservercert":true,"compatmode":true,"keepalive":"15","cleansession":true,"willTopic":"","willQos":"0","willRetain":null,"willPayload":"","birthTopic":"","birthQos":"0","birthRetain":null,"birthPayload":""},{"id":"6e2c3d80.227114","type":"file","z":"41656479.be9a9c","name":"","filename":"/mnt/1TB/log/emonpimqtt.log","appendNewline":true,"createDir":false,"overwriteFile":"false","x":890,"y":290,"wires":[]},{"id":"c58edf00.c2873","type":"mqtt in","z":"41656479.be9a9c","name":"","topic":"emon/#","broker":"286b0130.d794fe","x":446,"y":304,"wires":[["18439c97.f2f4d3"]]},{"id":"18439c97.f2f4d3","type":"function","z":"41656479.be9a9c","name":"","func":"var time = new Date().toString();\nvar newMsg = { payload: msg.topic + \",\" + msg.payload +\",\" + time};\nreturn newMsg;","outputs":1,"noerr":0,"x":610,"y":294,"wires":[["f5833444.a2dc98","6e2c3d80.227114"]]},{"id":"f5833444.a2dc98","type":"debug","z":"41656479.be9a9c","name":"","active":false,"console":"false","complete":"false","x":731,"y":201,"wires":[]}]

@glyn.hudson
to answer your questions

How often has this happened for you? It can happen a few times a day if I’m rebooting a lot.
It seems to happen more often after repeated reboots, I’d say much more if it’s been up for only a couple of minutes - I run Openhab on the same box and changing the items means I have to reboot, I also was chasing a stack dump on startup problem caused I think by Java maxing 2 of the CPU cores, so it could have been up for less than a couple of minutes at a time. It is often just one or 2 inputs get duplicated. If I’m not rebooting it doesn’t seem to happen.

Do the new inputs have identical names to the old ones? Yes, I’ve seen duplicate topics also appear in “inputs” that are created by the publish to MQTT action, and as they still work in Openhab I expect they have the same name.

Is it only on reboot? Yes

Fortunately I’ve solved my stackdump problem in Openhab so its nowhere near as often as it was, but I still occasionally get it on a reboot.

Hopefully this helps!

Interesting, that’s for the invite. Have emonSD build are you running?

Yes, good point. I’m also running openHAB and have noticed this. This proves that the new mqtt topic name is exactly the same. Maybe the name just changed for a second or two which triggered emoncms to created new inputs.

@glyn.hudson I’m running a home brew image based around your EMONPI image build instructions from the middle of February. To this I added the new MQTT setup from the forum instructions, and also the feed creation and mqtt_input fixes. Its running on a Pi3 with RFM69 without the display script as I need access to all my GPIO’s. It also doesn’t have Node Red or Lightwave.

1 Like

I just had this duplicate name for the first time - I was updating node-red to publish more topics under the same hierarchy and the original input went n/a and three new ones popped up, one with the same name. Could this be something to do with the way the MQTT subscriber iterates through multiple inputs?

@glyn.hudson One thing I have discovered, whilst looking for something completely different, is in Openhab.cfg I had

 mqtt:mosquitto.retain=true

set which is odd as I don’t remember ever setting it. I’ve now set

mqtt:mosquitto.retain=false

Will let you know if any duplicates arise with the new setting - as my Openhab is mature it doesn’t get rebooted as often as it used to so it may be a few days before any problems show up.

mqtt:mosquitto.retain=true

is set in the default openenergymonitor openhab config

How do you think this could be related to the issue we are discussing?

@glyn.hudson I’m not entirely sure, although I seem to remember someone mentioned retained messages as a possible problem when I reread the thread on the old forum before I started this one
I run a non OEM copy of Openhab, and the value on my system was set, yet on the line above in openhab.cfg it says:

# Optional. True or false. Defines if the broker should retain the messages sent to
# it. Defaults to false.

Suggesting that “true” isn’t a default.
In the V1.8 Runtime I just found in my downloads folder it isn’t set at all, and as I don’t remember setting it, its either a default from the V1.7 days or something I’ve done whilst troubleshooting a long time ago, and definitely before the MQTT changes that lead to the problem appearing.
I can understand how the retained messages setting would be useful, but on a read only file system its likely that it doesn’t work too well unless wherever it retains them has been moved to a writable location. In my case Openhab lives in /home/pi/data/openhab so it can write what it wants, but not knowing about this setting until now everything is setup to not use it.
I’ll leave it at false and report what happens.

1 Like

I raised the issue of retained messages here as the likely cause back in March.
Further reading about them can be found in this guide.
@kevin - even though you’ve set it to false, you may need to delete the retained messages which are slushing about in your system. The guide link above explains how, and I flushed mine by setting up a MQTT node in node-red for each of the ‘ghost’ MQTT ID’s, with a zero byte payload and retain flag set.
I changed the ID for each of the ghosts in turn, sending just one message to each, and they have not returned since.

As a test, why not just delete half of them initially, and then see what, if any ghosts reappear.

I described my issue in this post in the node-red forum, and was kindly assisted by Dave Conway-Jones (IBM nodered developer).

Paul

1 Like

Sorry I missed your comment regarding retained messages the first time round.

I have now changed mqtt:mosquitto.retain=false in the default openhab config. This change will get pulled in by the image image since openhab.cfg is symlinked eg.

/etc/openhab/configurations/openhab.cfg -> /home/pi/oem_openHab/openhab.cfg

Unless of course a users has made custom changes to the config. In that case they will need to make the change manually.

@Paul @glyn.hudson, when I looked into this back then I noted that mosquito.conf referenced a retained messages file on the read only part of the file system. Is it worthwhile amending this and the documentation so it is redirected into somewhere like /tmp that is stores in RAM assuming the permissions are set correctly? That way it can’t retain anything between reboots. (Never quite finished working on this back then due to life getting in the way)

Persistence is turned off in Mosquito by default on the current emonSD image. Your correct that persistence_location /var/lib/mosquitto/, is mentioned however since persistence is turned off this location will not be used.

Here is the default mosquitto conf on the image.:

[email protected]:~ $ cat /etc/mosquitto/mosquitto.conf 
# Place your local configuration in /etc/mosquitto/conf.d/
#
# A full description of the configuration file is at
# /usr/share/doc/mosquitto/examples/mosquitto.conf.example

pid_file /var/run/mosquitto.pid

persistence false
persistence_location /var/lib/mosquitto/

log_dest file /var/log/mosquitto/mosquitto.log

include_dir /etc/mosquitto/conf.d

allow_anonymous false
password_file /etc/mosquitto/passwd

@glyn perhaps this is where Openhab tried to store messages when persistence=true was set so after every reboot it got a few messages from the last time the file system was writable

I’ve been looking at this issue in my local installation. I think the issue has to do with phpmqtt_input.php and input_model.php in the EmonCMS project. After a reboot, they load a list of node names and input names from Redis (assuming it is enabled), and use that list of channels as a basis for deciding when to create new inputs in the future.

The problem is (at least on my installation) that Redis doesn’t appear to save that info across reboots, and so it feeds an empty list of nodes and channels into EmonCMS.

I switched the code to load the list from MySQL, and suddenly everything is back to normal (almost, I did have to delete all of the extra channels from MySQL).

I’m not sure if this was the right way to fix this, but hopefully it helps someone figure it out. I edited input_model.php around line 126 to always use the MySQL option.

Sorry to wake an old thread but did anything happen with tracking this down in the end?

I recently “lost” my emonpi for a while after a brown out caused it to shutdown and the network switch is was connected to hard failed. Now replaced, plugged the SD card into another RPi and let the long fsck complete before putting it back in place and now I have all duplicated inputs.

Before I manually fix this I am doing a full backup and update, but would love to know if this will likely happen again?

Ah! What I have discovered is simply deleting the “new” phantom entries, the old ones start working again. Phew. :slight_smile:

1 Like

I have this a lot, and I just realised that my emonpi I rebooted 2 days ago had a severe case of this and messed up all my logging :-(.

I could be wrong, but my hunch is that Mosquitto PHP client is multi-threaded, and it’s calling our message function multiple times here:

And then we have a race.

However it wouldn’t account for why the original inputs are not seen as they should be loaded in the get_inputs call.

Or maybe the get_inputs call is returning nothing… That would explain it. Could redis be not alive before this gets called at boot up?